Re: [HACKERS] Block level parallel vacuum

Started by Masahiko Sawada over 7 years ago · 408 messages
#1 Masahiko Sawada
sawada.mshk@gmail.com
2 attachment(s)

On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, I was thinking the commit was relevant to this issue, but as
Amit mentioned this error is emitted by DROP SCHEMA CASCADE.
I haven't found the cause of this issue yet. With the previous
version of the patch, autovacuum workers were working with one
parallel worker, but it never drops relations. So it's possible that
the error was not related to the patch, but anyway I'll continue to
work on that.

This depends on the extension lock patch from
/messages/by-id/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com/
if I am following correctly. So I propose to mark this patch as
returned with feedback for now, and come back to it once the root
problems are addressed. Feel free to correct me if you think that's
not appropriate.

I've re-designed the parallel vacuum patch. Attached is the latest
version of the patch. As discussed so far, this patch depends on the
extension lock patch[1]. However, I think we can discuss the design
of parallel vacuum independently from that patch. That's why I'm
proposing the new patch. In this patch, I restructured and refactored
lazy_scan_heap(), because it was a single big function that was not
suitable for parallelization.

The parallel vacuum worker processes keep waiting for commands from
the parallel vacuum leader process. Before entering each phase of lazy
vacuum, such as scanning the heap, vacuuming indexes, and vacuuming
the heap, the leader process advances the state of all workers to the
next state. Vacuum worker processes do the job according to their
state and wait for the next command after finishing. Also, before
entering the next phase, the leader process does some preparation work
while the vacuum workers are sleeping; for example, it clears the
shared dead tuple space before entering the 'scanning heap' phase. The
status of the vacuum workers is stored in a DSM area pointed to by the
WorkerState variable and controlled by the leader process. For the
basic design and performance improvements, please refer to my
presentation at PGCon 2018[2].
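
To make the control flow concrete, here is a minimal standalone model
of that leader/worker protocol. It uses pthreads and a condition
variable in place of the patch's DSM area and background workers, so
all names and helpers below are illustrative, not the patch's actual
code:

/*
 * Toy model of the parallel vacuum protocol: workers run the current
 * phase, report completion, and sleep until the leader publishes the
 * next phase.  The leader advances the shared state only after every
 * worker has finished the previous phase.
 */
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 2

typedef enum
{
	VACSTATE_SCAN_HEAP,
	VACSTATE_VACUUM_INDEX,
	VACSTATE_VACUUM_HEAP,
	VACSTATE_DONE
} VacWorkerState;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static VacWorkerState state = VACSTATE_SCAN_HEAP;	/* stands in for DSM */
static int	nfinished = 0;

static void *
worker_main(void *arg)
{
	VacWorkerState cur = VACSTATE_SCAN_HEAP;

	for (;;)
	{
		/* ... do the job for 'cur' (scan heap, vacuum an index, ...) ... */

		pthread_mutex_lock(&lock);
		nfinished++;				/* report completion to the leader */
		pthread_cond_broadcast(&cv);
		while (state == cur)		/* wait for the next command */
			pthread_cond_wait(&cv, &lock);
		cur = state;
		pthread_mutex_unlock(&lock);

		if (cur == VACSTATE_DONE)
			return NULL;
	}
}

static void
leader_set_state(VacWorkerState next)
{
	pthread_mutex_lock(&lock);
	while (nfinished < NWORKERS)	/* wait until all workers are idle */
		pthread_cond_wait(&cv, &lock);
	nfinished = 0;
	/* ... leader-only preparation, e.g. clear shared dead tuple space ... */
	state = next;					/* advance every worker at once */
	pthread_cond_broadcast(&cv);
	pthread_mutex_unlock(&lock);
}

int
main(void)
{
	pthread_t	workers[NWORKERS];

	for (int i = 0; i < NWORKERS; i++)
		pthread_create(&workers[i], NULL, worker_main, NULL);

	leader_set_state(VACSTATE_VACUUM_INDEX);
	leader_set_state(VACSTATE_VACUUM_HEAP);
	leader_set_state(VACSTATE_DONE);

	for (int i = 0; i < NWORKERS; i++)
		pthread_join(workers[i], NULL);
	printf("all vacuum phases completed\n");
	return 0;
}

The property this models is the one the patch relies on: a worker only
ever moves to a state the leader has published, and the leader only
publishes the next state after every worker has reported finishing the
previous one.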

The number of parallel vacuum workers is determined by either the
table size or the PARALLEL option of the VACUUM command. The maximum
number of parallel workers is max_parallel_maintenance_workers.
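
As a rough sketch of that policy (the function name, the size-based
scaling rule, and the 65536-page threshold below are my own
assumptions for illustration, not the patch's actual code):

/*
 * Hypothetical worker-count policy: an explicit PARALLEL N wins,
 * otherwise scale with the table size, and in either case cap the
 * result at max_parallel_maintenance_workers.
 */
static int
compute_parallel_vacuum_workers(int requested, long rel_pages,
								int max_workers)
{
	int			nworkers;

	if (requested > 0)
		nworkers = requested;	/* explicit PARALLEL N option */
	else
		nworkers = (int) (rel_pages / 65536);	/* assumed size heuristic */

	return (nworkers < max_workers) ? nworkers : max_workers;
}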

I've separated the code for the vacuum worker process into
backends/commands/vacuumworker.c, and created
includes/commands/vacuum_internal.h to declare the definitions for
lazy vacuum.

For autovacuum, this patch allows an autovacuum worker process to use
the parallel option according to the relation size or the reloption
(settable with ALTER TABLE ... SET
(autovacuum_vacuum_parallel_workers = N)). Regarding the autovacuum
delay, since there are no slots for the parallel workers of autovacuum
in AutoVacuumShmem, this patch doesn't support changing the autovacuum
delay configuration while a parallel autovacuum is running.

Please apply this patch together with the extension lock patch[1] when
testing, as this patch can try to extend visibility map pages
concurrently.

[1]: /messages/by-id/CAD21AoBn8WbOt21MFfj1mQmL2ZD8KVgMHYrOe1F5ozsQC4Z_hw@mail.gmail.com
[2]: https://www.pgcon.org/2018/schedule/events/1202.en.html

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v7-0002-Add-parallel-option-to-lazy-vacuum.patch (application/octet-stream)
From 77924a660889a37008a600aa3d9d38819cb0760f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Aug 2018 14:38:21 +0900
Subject: [PATCH v7 2/2] Add parallel option to lazy vacuum.

---
 doc/src/sgml/config.sgml               |    9 +-
 doc/src/sgml/ref/create_table.sgml     |   11 +-
 doc/src/sgml/ref/vacuum.sgml           |   16 +
 src/backend/access/common/reloptions.c |   10 +
 src/backend/access/transam/parallel.c  |    7 +
 src/backend/catalog/system_views.sql   |   23 +-
 src/backend/commands/Makefile          |    2 +-
 src/backend/commands/vacuum.c          |   53 +-
 src/backend/commands/vacuumlazy.c      | 2089 ++++++++++++++++++++++++--------
 src/backend/commands/vacuumworker.c    |  327 +++++
 src/backend/nodes/equalfuncs.c         |    7 +-
 src/backend/optimizer/plan/planner.c   |  133 ++
 src/backend/parser/gram.y              |   88 +-
 src/backend/postmaster/autovacuum.c    |   38 +-
 src/backend/postmaster/pgstat.c        |   26 +-
 src/backend/tcop/utility.c             |    4 +-
 src/backend/utils/adt/pgstatfuncs.c    |    8 +-
 src/include/catalog/pg_proc.dat        |    6 +-
 src/include/commands/vacuum.h          |   11 +-
 src/include/commands/vacuum_internal.h |  191 +++
 src/include/nodes/parsenodes.h         |   18 +-
 src/include/optimizer/planner.h        |    1 +
 src/include/pgstat.h                   |    9 +-
 src/include/storage/lwlock.h           |    1 +
 src/include/utils/rel.h                |    1 +
 src/test/regress/expected/rules.out    |   20 +-
 src/test/regress/expected/vacuum.out   |    2 +
 src/test/regress/sql/vacuum.sql        |    3 +
 28 files changed, 2479 insertions(+), 635 deletions(-)
 create mode 100644 src/backend/commands/vacuumworker.c
 create mode 100644 src/include/commands/vacuum_internal.h

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bee4afb..ce66ea3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2141,10 +2141,11 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building
+         a B-tree index) and <command>VACUUM</command> without
+         <literal>FULL</literal>.  Parallel workers are taken from the
          pool of processes established by <xref
          linkend="guc-max-worker-processes"/>, limited by <xref
          linkend="guc-max-parallel-workers"/>.  Note that the requested
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index d936de3..23da677 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1421,7 +1421,16 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>autovacuum_vacuum_parallel_workers</literal>, <literal>toast.autovacuum_vacuum_parallel_workers</literal> (<type>integer</type>)</term>
+    <listitem>
+     <para>
+      This sets the number of workers that can be used to vacuum this table. If not set, autovacuum runs on this table without parallel workers (non-parallel).
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>autovacuum_freeze_min_age</literal>, <literal>toast.autovacuum_freeze_min_age</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index b760e8e..aea2fd8 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL <replaceable class="parameter">N</replaceable>
     DISABLE_PAGE_SKIPPING
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -142,6 +143,21 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute <command>VACUUM</command> in parallel using <replaceable
+      class="parameter">N</replaceable> background workers. Garbage collection
+      on the table is processed with block-level parallelism. For tables with
+      indexes, parallel vacuum assigns each index to a particular worker, and
+      all garbage in an index is processed by that worker. The maximum number
+      of parallel workers is <xref linkend="guc-max-parallel-maintenance-workers"/>.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index db84da0..45e2bca 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -348,6 +348,14 @@ static relopt_int intRelOpts[] =
 		},
 		-1, 0, 1024
 	},
+	{
+		{
+			"autovacuum_vacuum_parallel_workers",
+			"Number of parallel processes that can be used to vacuum this relation",
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+			ShareUpdateExclusiveLock
+		}, -1, 0, 1024
+	},
 
 	/* list terminator */
 	{{NULL}}
@@ -1377,6 +1385,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, analyze_scale_factor)},
+		{"autovacuum_vacuum_parallel_workers", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_parallel_workers)},
 		{"user_catalog_table", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, user_catalog_table)},
 		{"parallel_workers", RELOPT_TYPE_INT,
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index c168118..391b1cb 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -22,6 +22,7 @@
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
+#include "commands/vacuum.h"
 #include "executor/execParallel.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
@@ -134,6 +135,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"lazy_parallel_vacuum_main", lazy_parallel_vacuum_main
 	}
 };
 
@@ -1266,6 +1270,9 @@ ParallelWorkerMain(Datum main_arg)
 	ParallelMasterBackendId = fps->parallel_master_backend_id;
 	on_shmem_exit(ParallelWorkerShutdown, (Datum) 0);
 
+	/* Report pid of master process for progress information */
+	pgstat_report_leader_pid(fps->parallel_master_pid);
+
 	/*
 	 * Now we can find and attach to the error queue provided for us.  That's
 	 * good, because until we do that, any errors that happen here will not be
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7251552..61771c0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -897,11 +897,24 @@ CREATE VIEW pg_stat_progress_vacuum AS
 					  WHEN 5 THEN 'truncating heap'
 					  WHEN 6 THEN 'performing final cleanup'
 					  END AS phase,
-		S.param2 AS heap_blks_total, S.param3 AS heap_blks_scanned,
-		S.param4 AS heap_blks_vacuumed, S.param5 AS index_vacuum_count,
-		S.param6 AS max_dead_tuples, S.param7 AS num_dead_tuples
-    FROM pg_stat_get_progress_info('VACUUM') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+		S.param2 AS heap_blks_total,
+		W.heap_blks_scanned,
+		W.heap_blks_vacuumed,
+		W.index_vacuum_count,
+		S.param6 AS max_dead_tuples,
+		W.num_dead_tuples
+	FROM pg_stat_get_progress_info('VACUUM') AS S
+	        LEFT JOIN pg_database D ON S.datid = D.oid
+		LEFT JOIN
+		(SELECT leader_pid,
+			max(param3) AS heap_blks_scanned,
+			max(param4) AS heap_blks_vacuumed,
+			max(param5) AS index_vacuum_count,
+			max(param7) AS num_dead_tuples
+	        FROM pg_stat_get_progress_info('VACUUM')
+		GROUP BY leader_pid) AS W ON S.pid = W.leader_pid
+	WHERE
+		S.pid = S.leader_pid;
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 4a6c99e..c3623da 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
 	policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \
 	schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \
 	tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \
-	vacuum.o vacuumlazy.o variable.o view.o
+	vacuum.o vacuumlazy.o vacuumworker.o variable.o view.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ee32fe8..abb7daa 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -38,6 +38,7 @@
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "optimizer/planner.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
@@ -74,7 +75,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +90,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -116,7 +117,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -163,7 +164,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +175,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +185,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +207,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +217,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +282,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +336,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +355,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +391,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -1304,7 +1305,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1326,7 +1327,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1366,7 +1367,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/*
 	 * Open the relation and get the appropriate lock on it.
@@ -1377,7 +1378,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * If we've been asked not to wait for the relation lock, acquire it first
 	 * in non-blocking mode, before calling try_relation_open().
 	 */
-	if (!(options & VACOPT_SKIP_LOCKED))
+	if (!(options.flags & VACOPT_SKIP_LOCKED))
 		onerel = try_relation_open(relid, lmode);
 	else if (ConditionalLockRelationOid(relid, lmode))
 		onerel = try_relation_open(relid, NoLock);
@@ -1530,7 +1531,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1549,7 +1550,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1557,7 +1558,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 5649a70..391d286 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -22,6 +22,17 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum can be performed with parallel workers. In parallel lazy vacuum,
+ * multiple vacuum worker processes get blocks in parallel using a parallel
+ * heap scan and process each of them. If the table has indexes, the parallel
+ * vacuum workers vacuum the heap and indexes in parallel, and the dead tuple
+ * TIDs are shared among all vacuum processes including the leader process.
+ * Before entering each state, such as scanning the heap or vacuuming indexes,
+ * the leader process does some preparation work and asks all vacuum worker
+ * processes to run the same state. If the table has no indexes, all vacuum
+ * processes just vacuum each page as they go, so the dead tuple TIDs are not
+ * shared. The information required by parallel lazy vacuum, such as the
+ * vacuum statistics and the parallel heap scan descriptor, is also shared.
  *
  * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -38,23 +49,32 @@
 
 #include "access/genam.h"
 #include "access/heapam.h"
-#include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
+#include "commands/vacuum_internal.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
+#include "optimizer/pathnode.h"
+#include "optimizer/planmain.h"
+#include "optimizer/planner.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/ipc.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,70 +131,148 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
-typedef struct LVRelStats
+/* See note in lazy_scan_get_nextpage about forcing scanning of last page */
+#define FORCE_CHECK_PAGE(nblocks, blkno, vacrelstats) \
+	((blkno) == (nblocks) - 1 && should_attempt_truncation((vacrelstats)))
+
+/* Macros for checking the status of vacuum worker slot */
+#define IsVacuumWorkerStopped(pid) ((pid) == 0)
+#define IsVacuumWorkerInvalid(pid) (((pid) == InvalidPid) || ((pid) == 0))
+
+/*
+ * LVTidMap tracks the dead tuple TIDs collected during the heap scan. The
+ * 'shared' field indicates whether the LVTidMap is shared among vacuum
+ * workers; when it is true, the map lives in shared memory.
+ */
+struct LVTidMap
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
-	/* Overall statistics about rel */
-	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
-	BlockNumber rel_pages;		/* total number of pages */
-	BlockNumber scanned_pages;	/* number of pages we examined */
-	BlockNumber pinskipped_pages;	/* # of pages we skipped due to a pin */
-	BlockNumber frozenskipped_pages;	/* # of frozen pages we skipped */
-	BlockNumber tupcount_pages; /* pages whose tuples we counted */
-	double		old_live_tuples;	/* previous value of pg_class.reltuples */
-	double		new_rel_tuples; /* new estimated total # of tuples */
-	double		new_live_tuples;	/* new estimated total # of live tuples */
-	double		new_dead_tuples;	/* new estimated total # of dead tuples */
-	BlockNumber pages_removed;
-	double		tuples_deleted;
-	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
+	int		max_items;	/* # slots allocated in itemptrs */
+	int		num_items;	/* current # of entries */
+
+	/* The fields used for vacuum heap */
+	int		item_idx;
+	int		vacuumed_pages;	/* # pages vacuumed in a heap vacuum cycle */
+
+	/* The fields used for only parallel lazy vacuum */
+	bool	shared;		/* dead tuple space is shared among vacuum workers */
+	slock_t	mutex;
+
 	/* List of TIDs of tuples we intend to delete */
 	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
-	int			num_index_scans;
-	TransactionId latestRemovedXid;
-	bool		lock_waiter_detected;
-} LVRelStats;
+	ItemPointerData	itemptrs[FLEXIBLE_ARRAY_MEMBER];
+};
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for index statistics that are used for parallel lazy vacuum.
+ * In single-process lazy vacuum, we update the statistics of each index
+ * right after cleaning it up. However, since no updates are allowed while
+ * in parallel mode, we store all index statistics in LVIndStats and update
+ * them after exiting parallel mode.
+ */
+typedef struct IndexStats
+{
+	bool		need_update;
+	BlockNumber	num_pages;
+	BlockNumber	num_tuples;
+} IndexStats;
+struct LVIndStats
+{
+	/*
+	 * nindexes is the length of the stats array. nprocessed and mutex are
+	 * used only in parallel lazy vacuum, when each index is processed by
+	 * one of the workers or the leader.
+	 */
+	int		nindexes;	/* total # of indexes */
+	int		nprocessed;	/* used for vacuum/cleanup index */
+	slock_t		mutex;	/* protect nprocessed */
+	IndexStats stats[FLEXIBLE_ARRAY_MEMBER];
+};
+#define SizeOfLVIndStats (offsetof(LVIndStats, stats) + sizeof(IndexStats))
+
+/* Scan description data for lazy vacuum */
+struct LVScanDescData
+{
+	/* Common information for scanning heap */
+	Relation	lv_rel;
+	bool		disable_page_skipping;	/* enable DISABLE_PAGE_SKIPPING option */
+	bool		aggressive;				/* aggressive vacuum */
+
+	/* Used for single lazy vacuum, otherwise NULL */
+	HeapScanDesc lv_heapscan;
 
+	/* Used for parallel lazy vacuum, otherwise invalid values */
+	BlockNumber	lv_cblock;
+	BlockNumber	lv_next_unskippable_block;
+	BlockNumber	lv_nblocks;
+};
+
+/*
+ * Status for leader in parallel lazy vacuum. LVLeader is only present
+ * in the leader process.
+ */
+typedef struct LVLeader
+{
+	/*
+	 * allrelstats points to a shared memory area that stores the relation
+	 * statistics of all vacuum workers.
+	 */
+	LVRelStats	*allrelstats;
+	ParallelContext	*pcxt;
+} LVLeader;
+
+/* Global variables for lazy vacuum */
+LVWorkerState	*WorkerState = NULL;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
-
+static BufferAccessStrategy vac_strategy;
 static TransactionId OldestXmin;
 static TransactionId FreezeLimit;
 static MultiXactId MultiXactCutoff;
 
-static BufferAccessStrategy vac_strategy;
-
-
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  LVRelStats *vacrelstats,
+				  LVTidMap *dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+							   IndexBulkDeleteResult *stats,
+							   LVRelStats *vacrelstats,
+							   IndexStats *indstats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno, Buffer buffer,
+							int tupindex, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples,
 					   ItemPointer itemptr);
-static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
+static bool lazy_tid_reaped(ItemPointer itemptr, void *dt);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static BlockNumber lazy_get_next_vacuum_page(LVState *lvstate, int *tupindex_p,
+											 int *npages_p);
+static bool lazy_dead_tuples_is_full(LVTidMap *tidmap);
+static int lazy_get_dead_tuple_count(LVTidMap *dead_tuples);
+static BlockNumber lazy_scan_get_nextpage(LVScanDesc lvscan, LVRelStats *vacrelstats,
+										  bool *all_visible_according_to_vm_p,
+										  Buffer *vmbuffer_p);
+static long lazy_get_max_dead_tuples(LVRelStats *vacrelstats, BlockNumber relblocks);
+
+/* function prototypes for parallel vacuum */
+static LVLeader *lazy_vacuum_begin_parallel(LVState *lvstate, int request);
+static void lazy_vacuum_end_parallel(LVState *lvstate, LVLeader *lvleader,
+									 bool update_stats);
+static void lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader,
+									int next_state);
+static void lazy_gather_worker_stats(LVLeader *lvleader, LVRelStats *vacrelstats);
+static void lazy_wait_for_vacuum_workers_to_be_done(void);
+static void lazy_set_workers_state(VacWorkerState new_state);
+static void lazy_wait_for_vacuum_workers_attach(ParallelContext *pcxt);
 
 
 /*
@@ -187,12 +285,11 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+lazy_vacuum_rel(Relation onerel, VacuumOption options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
-	Relation   *Irel;
-	int			nindexes;
 	PGRUsage	ru0;
 	TimestampTz starttime = 0;
 	long		secs;
@@ -218,7 +315,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,26 +343,34 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
-	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
+	/* Create lazy vacuum state and statistics */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->options = options;
+	lvstate->aggressive = aggressive;
+	lvstate->relid = RelationGetRelid(onerel);
+	lvstate->relation = onerel;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+	lvstate->lvshared = NULL;
 
+	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
 	vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
 	vacrelstats->old_live_tuples = onerel->rd_rel->reltuples;
-	vacrelstats->num_index_scans = 0;
-	vacrelstats->pages_removed = 0;
 	vacrelstats->lock_waiter_detected = false;
+	lvstate->vacrelstats = vacrelstats;
 
 	/* Open all indexes of the relation */
-	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vac_open_indexes(onerel, RowExclusiveLock, &lvstate->nindexes, &lvstate->indRels);
+	vacrelstats->hasindex = (lvstate->nindexes > 0);
 
-	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
-	vac_close_indexes(nindexes, Irel, NoLock);
+	vac_close_indexes(lvstate->nindexes, lvstate->indRels, NoLock);
 
 	/*
 	 * Compute whether we actually scanned the all unfrozen pages. If we did,
@@ -374,7 +479,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 			 * emitting individual parts of the message when not applicable.
 			 */
 			initStringInfo(&buf);
-			if (aggressive)
+			if (lvstate->aggressive)
 				msgfmt = _("automatic aggressive vacuum of table \"%s.%s.%s\": index scans: %d\n");
 			else
 				msgfmt = _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n");
@@ -444,7 +549,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
 }
 
 /*
- *	lazy_scan_heap() -- scan an open heap relation
+ *	do_lazy_scan_heap() -- scan an open heap relation
  *
  *		This routine prunes each page in the heap, which will among other
  *		things truncate dead tuples to dead line pointers, defragment the
@@ -459,32 +564,19 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
-static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+int
+do_lazy_scan_heap(LVState *lvstate, bool *isFinished)
 {
-	BlockNumber nblocks,
-				blkno;
+	Relation 	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	BlockNumber	blkno;
 	HeapTupleData tuple;
 	char	   *relname;
 	TransactionId relfrozenxid = onerel->rd_rel->relfrozenxid;
 	TransactionId relminmxid = onerel->rd_rel->relminmxid;
-	BlockNumber empty_pages,
-				vacuumed_pages,
-				next_fsm_block_to_vacuum;
-	double		num_tuples,		/* total number of nonremovable tuples */
-				live_tuples,	/* live tuples (reltuples estimate) */
-				tups_vacuumed,	/* tuples cleaned up by vacuum */
-				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
 	int			i;
-	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
-	BlockNumber next_unskippable_block;
-	bool		skipping_blocks;
-	xl_heap_freeze_tuple *frozen;
-	StringInfoData buf;
+	bool		all_visible_according_to_vm;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -492,117 +584,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	};
 	int64		initprog_val[3];
 
-	pg_rusage_init(&ru0);
-
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
-		ereport(elevel,
-				(errmsg("aggressively vacuuming \"%s.%s\"",
-						get_namespace_name(RelationGetNamespace(onerel)),
-						relname)));
-	else
-		ereport(elevel,
-				(errmsg("vacuuming \"%s.%s\"",
-						get_namespace_name(RelationGetNamespace(onerel)),
-						relname)));
-
-	empty_pages = vacuumed_pages = 0;
-	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
-
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
-	nblocks = RelationGetNumberOfBlocks(onerel);
-	vacrelstats->rel_pages = nblocks;
-	vacrelstats->scanned_pages = 0;
-	vacrelstats->tupcount_pages = 0;
-	vacrelstats->nonempty_pages = 0;
-	vacrelstats->latestRemovedXid = InvalidTransactionId;
-
-	lazy_space_alloc(vacrelstats, nblocks);
-	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
-	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[1] = lvstate->lvscan->lv_nblocks;
+	initprog_val[2] = lvstate->dead_tuples->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/*
-	 * Except when aggressive is set, we want to skip pages that are
-	 * all-visible according to the visibility map, but only when we can skip
-	 * at least SKIP_PAGES_THRESHOLD consecutive pages.  Since we're reading
-	 * sequentially, the OS should be doing readahead for us, so there's no
-	 * gain in skipping a page now and then; that's likely to disable
-	 * readahead and so be counterproductive. Also, skipping even a single
-	 * page means that we can't update relfrozenxid, so we only want to do it
-	 * if we can skip a goodly number of pages.
-	 *
-	 * When aggressive is set, we can't skip pages just because they are
-	 * all-visible, but we can still skip pages that are all-frozen, since
-	 * such pages do not need freezing and do not affect the value that we can
-	 * safely set for relfrozenxid or relminmxid.
-	 *
-	 * Before entering the main loop, establish the invariant that
-	 * next_unskippable_block is the next block number >= blkno that we can't
-	 * skip based on the visibility map, either all-visible for a regular scan
-	 * or all-frozen for an aggressive scan.  We set it to nblocks if there's
-	 * no such block.  We also set up the skipping_blocks flag correctly at
-	 * this stage.
-	 *
-	 * Note: The value returned by visibilitymap_get_status could be slightly
-	 * out-of-date, since we make this test before reading the corresponding
-	 * heap page or locking the buffer.  This is OK.  If we mistakenly think
-	 * that the page is all-visible or all-frozen when in fact the flag's just
-	 * been cleared, we might fail to vacuum the page.  It's easy to see that
-	 * skipping a page when aggressive is not set is not a very big deal; we
-	 * might leave some dead tuples lying around, but the next vacuum will
-	 * find them.  But even when aggressive *is* set, it's still OK if we miss
-	 * a page whose all-frozen marking has just been cleared.  Any new XIDs
-	 * just added to that page are necessarily newer than the GlobalXmin we
-	 * computed, so they'll have no effect on the value to which we can safely
-	 * set relfrozenxid.  A similar argument applies for MXIDs and relminmxid.
-	 *
-	 * We will scan the table's last page, at least to the extent of
-	 * determining whether it has tuples or not, even if it should be skipped
-	 * according to the above rules; except when we've already determined that
-	 * it's not worth trying to truncate the table.  This avoids having
-	 * lazy_truncate_heap() take access-exclusive lock on the table to attempt
-	 * a truncation that just fails immediately because there are tuples in
-	 * the last page.  This is worth avoiding mainly because such a lock must
-	 * be replayed on any hot standby, where it can be disruptive.
-	 */
-	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
-	{
-		while (next_unskippable_block < nblocks)
-		{
-			uint8		vmstatus;
-
-			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
-												&vmbuffer);
-			if (aggressive)
-			{
-				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
-					break;
-			}
-			else
-			{
-				if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
-					break;
-			}
-			vacuum_delay_point();
-			next_unskippable_block++;
-		}
-	}
-
-	if (next_unskippable_block >= SKIP_PAGES_THRESHOLD)
-		skipping_blocks = true;
-	else
-		skipping_blocks = false;
-
-	for (blkno = 0; blkno < nblocks; blkno++)
+	while ((blkno = lazy_scan_get_nextpage(lvstate->lvscan, lvstate->vacrelstats,
+										   &all_visible_according_to_vm, &vmbuffer))
+		   != InvalidBlockNumber)
 	{
 		Buffer		buf;
 		Page		page;
@@ -619,159 +611,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		bool		has_dead_tuples;
 		TransactionId visibility_cutoff_xid = InvalidTransactionId;
 
-		/* see note above about forcing scanning of last page */
-#define FORCE_CHECK_PAGE() \
-		(blkno == nblocks - 1 && should_attempt_truncation(vacrelstats))
-
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-
-		if (blkno == next_unskippable_block)
-		{
-			/* Time to advance next_unskippable_block */
-			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
-			{
-				while (next_unskippable_block < nblocks)
-				{
-					uint8		vmskipflags;
-
-					vmskipflags = visibilitymap_get_status(onerel,
-														   next_unskippable_block,
-														   &vmbuffer);
-					if (aggressive)
-					{
-						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
-							break;
-					}
-					else
-					{
-						if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
-							break;
-					}
-					vacuum_delay_point();
-					next_unskippable_block++;
-				}
-			}
-
-			/*
-			 * We know we can't skip the current block.  But set up
-			 * skipping_blocks to do the right thing at the following blocks.
-			 */
-			if (next_unskippable_block - blkno > SKIP_PAGES_THRESHOLD)
-				skipping_blocks = true;
-			else
-				skipping_blocks = false;
-
-			/*
-			 * Normally, the fact that we can't skip this block must mean that
-			 * it's not all-visible.  But in an aggressive vacuum we know only
-			 * that it's not all-frozen, so it might still be all-visible.
-			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
-				all_visible_according_to_vm = true;
-		}
-		else
-		{
-			/*
-			 * The current block is potentially skippable; if we've seen a
-			 * long enough run of skippable blocks to justify skipping it, and
-			 * we're not forced to check it, then go ahead and skip.
-			 * Otherwise, the page must be at least all-visible if not
-			 * all-frozen, so we can set all_visible_according_to_vm = true.
-			 */
-			if (skipping_blocks && !FORCE_CHECK_PAGE())
-			{
-				/*
-				 * Tricky, tricky.  If this is in aggressive vacuum, the page
-				 * must have been all-frozen at the time we checked whether it
-				 * was skippable, but it might not be any more.  We must be
-				 * careful to count it as a skipped all-frozen page in that
-				 * case, or else we'll think we can't update relfrozenxid and
-				 * relminmxid.  If it's not an aggressive vacuum, we don't
-				 * know whether it was all-frozen, so we have to recheck; but
-				 * in this case an approximate answer is OK.
-				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
-					vacrelstats->frozenskipped_pages++;
-				continue;
-			}
-			all_visible_according_to_vm = true;
-		}
-
 		vacuum_delay_point();
 
 		/*
-		 * If we are close to overrunning the available space for dead-tuple
-		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
-		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
-		{
-			const int	hvp_index[] = {
-				PROGRESS_VACUUM_PHASE,
-				PROGRESS_VACUUM_NUM_INDEX_VACUUMS
-			};
-			int64		hvp_val[2];
-
-			/*
-			 * Before beginning index vacuuming, we release any pin we may
-			 * hold on the visibility map page.  This isn't necessary for
-			 * correctness, but we do it anyway to avoid holding the pin
-			 * across a lengthy, unrelated operation.
-			 */
-			if (BufferIsValid(vmbuffer))
-			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
-			}
-
-			/* Log cleanup info before we touch indexes */
-			vacuum_log_cleanup_info(onerel, vacrelstats);
-
-			/* Report that we are now vacuuming indexes */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
-
-			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
-
-			/*
-			 * Report that we are now vacuuming the heap.  We also increase
-			 * the number of index scans here; note that by using
-			 * pgstat_progress_update_multi_param we can update both
-			 * parameters atomically.
-			 */
-			hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
-			hvp_val[1] = vacrelstats->num_index_scans + 1;
-			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
-
-			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
-
-			/*
-			 * Forget the now-vacuumed tuples, and press on, but be careful
-			 * not to reset latestRemovedXid since we want that value to be
-			 * valid.
-			 */
-			vacrelstats->num_dead_tuples = 0;
-			vacrelstats->num_index_scans++;
-
-			/*
-			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
-			 */
-			FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
-			next_fsm_block_to_vacuum = blkno;
-
-			/* Report that we are once again scanning the heap */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
-		}
-
-		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.  However, it's
@@ -794,7 +636,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive &&
+				!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -827,7 +670,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -881,7 +724,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						(errmsg("relation \"%s\" page %u is uninitialized --- fixing",
 								relname, blkno)));
 				PageInit(page, BufferGetPageSize(buf), 0);
-				empty_pages++;
+				vacrelstats->empty_pages++;
 			}
 			freespace = PageGetHeapFreeSpace(page);
 			MarkBufferDirty(buf);
@@ -893,7 +736,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 		if (PageIsEmpty(page))
 		{
-			empty_pages++;
+			vacrelstats->empty_pages++;
 			freespace = PageGetHeapFreeSpace(page);
 
 			/* empty pages are always all-visible and all-frozen */
@@ -935,7 +778,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 *
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
-		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
+		vacrelstats->tuples_deleted += heap_page_prune(onerel, buf, OldestXmin, false,
 										 &vacrelstats->latestRemovedXid);
 
 		/*
@@ -946,7 +789,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = lazy_get_dead_tuple_count(lvstate->dead_tuples);
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -964,7 +807,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			/* Unused items require no processing, but we count 'em */
 			if (!ItemIdIsUsed(itemid))
 			{
-				nunused += 1;
+				vacrelstats->unused_tuples += 1;
 				continue;
 			}
 
@@ -979,13 +822,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			/*
 			 * DEAD item pointers are to be vacuumed normally; but we don't
-			 * count them in tups_vacuumed, else we'd be double-counting (at
+			 * count them in vacrelstats->tuples_deleted, else we'd be double-counting (at
 			 * least in the common case where heap_page_prune() just freed up
 			 * a non-HOT tuple).
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(lvstate->dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1037,7 +880,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 */
 					if (HeapTupleIsHotUpdated(&tuple) ||
 						HeapTupleIsHeapOnly(&tuple))
-						nkeep += 1;
+						vacrelstats->new_dead_tuples += 1;
 					else
 						tupgone = true; /* we can delete the tuple */
 					all_visible = false;
@@ -1053,7 +896,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * Count it as live.  Not only is this natural, but it's
 					 * also what acquire_sample_rows() does.
 					 */
-					live_tuples += 1;
+					vacrelstats->live_tuples += 1;
 
 					/*
 					 * Is the tuple definitely visible to all transactions?
@@ -1096,7 +939,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * If tuple is recently deleted then we must not remove it
 					 * from relation.
 					 */
-					nkeep += 1;
+					vacrelstats->new_dead_tuples += 1;
 					all_visible = false;
 					break;
 				case HEAPTUPLE_INSERT_IN_PROGRESS:
@@ -1122,7 +965,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * deleting transaction will commit and update the
 					 * counters after we report.
 					 */
-					live_tuples += 1;
+					vacrelstats->live_tuples += 1;
 					break;
 				default:
 					elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
@@ -1131,17 +974,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(lvstate->dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
+				vacrelstats->tuples_deleted += 1;
 				has_dead_tuples = true;
 			}
 			else
 			{
 				bool		tuple_totally_frozen;
 
-				num_tuples += 1;
+				vacrelstats->num_tuples += 1;
 				hastup = true;
 
 				/*
@@ -1151,9 +994,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				if (heap_prepare_freeze_tuple(tuple.t_data,
 											  relfrozenxid, relminmxid,
 											  FreezeLimit, MultiXactCutoff,
-											  &frozen[nfrozen],
+											  &(lvstate->frozen[nfrozen]),
 											  &tuple_totally_frozen))
-					frozen[nfrozen++].offset = offnum;
+					lvstate->frozen[nfrozen++].offset = offnum;
 
 				if (!tuple_totally_frozen)
 					all_frozen = false;
@@ -1177,10 +1020,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				ItemId		itemid;
 				HeapTupleHeader htup;
 
-				itemid = PageGetItemId(page, frozen[i].offset);
+				itemid = PageGetItemId(page, lvstate->frozen[i].offset);
 				htup = (HeapTupleHeader) PageGetItem(page, itemid);
 
-				heap_execute_freeze_tuple(htup, &frozen[i]);
+				heap_execute_freeze_tuple(htup, &(lvstate->frozen[i]));
 			}
 
 			/* Now WAL-log freezing if necessary */
@@ -1189,7 +1032,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				XLogRecPtr	recptr;
 
 				recptr = log_heap_freeze(onerel, buf, FreezeLimit,
-										 frozen, nfrozen);
+										 lvstate->frozen, nfrozen);
 				PageSetLSN(page, recptr);
 			}
 
@@ -1200,20 +1043,24 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 &&
+			lazy_get_dead_tuple_count(lvstate->dead_tuples) > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer);
 			has_dead_tuples = false;
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
+			 *
+			 * If the table has no indexes, the dead tuple space lives in
+			 * local memory regardless of whether the lazy vacuum is parallel,
+			 * so we don't need to acquire a lock to modify it.
 			 */
-			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
+			lvstate->dead_tuples->num_items = 0;
+			vacrelstats->vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1221,11 +1068,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * the current block, we haven't yet updated its FSM entry (that
 			 * happens further down), so passing end == blkno is correct.
 			 */
-			if (blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
+			if (blkno - lvstate->next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
 			{
-				FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum,
+				FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
 										blkno);
-				next_fsm_block_to_vacuum = blkno;
+				lvstate->next_fsm_block_to_vacuum = blkno;
 			}
 		}
 
@@ -1328,127 +1175,31 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (lazy_get_dead_tuple_count(lvstate->dead_tuples) == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
-	}
-
-	/* report that everything is scanned and vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-
-	pfree(frozen);
 
-	/* save stats for use later */
-	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
-
-	/* now we can compute the new value for pg_class.reltuples */
-	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
-														  nblocks,
-														  vacrelstats->tupcount_pages,
-														  live_tuples);
-
-	/* also compute total number of surviving heap entries */
-	vacrelstats->new_rel_tuples =
-		vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
+		/* Dead tuple space is full, exit scanning */
+		if (lvstate->nindexes > 0 && lazy_dead_tuples_is_full(lvstate->dead_tuples))
+			break;
+	}
 
 	/*
-	 * Release any remaining pin on visibility map page.
+	 * Before beginning index vacuuming, we release any pin we may
+	 * hold on the visibility map page.  This isn't necessary for
+	 * correctness, but we do it anyway to avoid holding the pin
+	 * across a lengthy, unrelated operation.
 	 */
 	if (BufferIsValid(vmbuffer))
-	{
 		ReleaseBuffer(vmbuffer);
-		vmbuffer = InvalidBuffer;
-	}
-
-	/* If any tuples need to be deleted, perform final vacuum cycle */
-	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
-	{
-		const int	hvp_index[] = {
-			PROGRESS_VACUUM_PHASE,
-			PROGRESS_VACUUM_NUM_INDEX_VACUUMS
-		};
-		int64		hvp_val[2];
-
-		/* Log cleanup info before we touch indexes */
-		vacuum_log_cleanup_info(onerel, vacrelstats);
-
-		/* Report that we are now vacuuming indexes */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
-
-		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
-
-		/* Report that we are now vacuuming the heap */
-		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
-		hvp_val[1] = vacrelstats->num_index_scans + 1;
-		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
-
-		/* Remove tuples from heap */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
-		vacrelstats->num_index_scans++;
-	}
-
-	/*
-	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
-	 * not there were indexes.
-	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
-
-	/* report all blocks vacuumed; and that we're cleaning up */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
-
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
-	if (vacuumed_pages)
-		ereport(elevel,
-				(errmsg("\"%s\": removed %.0f row versions in %u pages",
-						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+	/* Reached the end of the table */
+	if (!BlockNumberIsValid(blkno))
+		*isFinished = true;
 
-	/*
-	 * This is pretty messy, but we split it up so that we can skip emitting
-	 * individual parts of the message when not applicable.
-	 */
-	initStringInfo(&buf);
-	appendStringInfo(&buf,
-					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
-					 nkeep, OldestXmin);
-	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
-					 nunused);
-	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
-									"Skipped %u pages due to buffer pins, ",
-									vacrelstats->pinskipped_pages),
-					 vacrelstats->pinskipped_pages);
-	appendStringInfo(&buf, ngettext("%u frozen page.\n",
-									"%u frozen pages.\n",
-									vacrelstats->frozenskipped_pages),
-					 vacrelstats->frozenskipped_pages);
-	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
-									"%u pages are entirely empty.\n",
-									empty_pages),
-					 empty_pages);
-	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
+	/* Remember the block we just scanned before leaving */
+	lvstate->current_block = blkno;
 
-	ereport(elevel,
-			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
-					RelationGetRelationName(onerel),
-					tups_vacuumed, num_tuples,
-					vacrelstats->scanned_pages, nblocks),
-			 errdetail_internal("%s", buf.data)));
-	pfree(buf.data);
+	return lazy_get_dead_tuple_count(lvstate->dead_tuples);
 }
 
 
@@ -1463,38 +1214,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * the tuples until we've removed their index entries, and we want to
  * process index entry removal in batches as large as possible.
  */
-static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+void
+lazy_vacuum_heap(Relation onerel, LVState *lvstate)
 {
-	int			tupindex;
+	int			tupindex = 0;
 	int			npages;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
+	BlockNumber tblk;
 
 	pg_rusage_init(&ru0);
 	npages = 0;
 
-	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while ((tblk = lazy_get_next_vacuum_page(lvstate, &tupindex, &npages))
+		   != InvalidBlockNumber)
 	{
-		BlockNumber tblk;
 		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
 		{
 			ReleaseBuffer(buf);
-			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+
+		lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex, &vmbuffer);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1502,7 +1251,6 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		UnlockReleaseBuffer(buf);
 		RecordPageWithFreeSpace(onerel, tblk, freespace);
-		npages++;
 	}
 
 	if (BufferIsValid(vmbuffer))
@@ -1511,11 +1259,13 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 		vmbuffer = InvalidBuffer;
 	}
 
-	ereport(elevel,
-			(errmsg("\"%s\": removed %d row versions in %d pages",
-					RelationGetRelationName(onerel),
-					tupindex, npages),
-			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+	/* Only the vacuum leader makes this report */
+	if (!IsParallelWorker())
+		ereport(elevel,
+				(errmsg("\"%s\": removed %d row versions in %d pages",
+						RelationGetRelationName(onerel),
+						tupindex, npages),
+				 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
@@ -1529,29 +1279,32 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer)
 {
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+	LVTidMap   *dead_tuples = lvstate->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
 	TransactionId visibility_cutoff_xid;
 	bool		all_frozen;
+	int			num_items = lazy_get_dead_tuple_count(dead_tuples);
 
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < num_items; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1681,7 +1434,8 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 static void
 lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+				  LVRelStats *vacrelstats,
+				  LVTidMap	*dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1698,12 +1452,13 @@ lazy_vacuum_index(Relation indrel,
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
 
+	/* Both the leader and the workers make this report */
 	ereport(elevel,
 			(errmsg("scanned index \"%s\" to remove %d row versions",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					lazy_get_dead_tuple_count(dead_tuples)),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1713,7 +1468,8 @@ lazy_vacuum_index(Relation indrel,
 static void
 lazy_cleanup_index(Relation indrel,
 				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   LVRelStats *vacrelstats,
+				   IndexStats *indstats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1740,18 +1496,29 @@ lazy_cleanup_index(Relation indrel,
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
+	 * is accurate and we are not in a parallel lazy vacuum.
 	 */
 	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	{
+		if (indstats)
+		{
+			/* In parallel lazy vacuum, remember the stats and update them later */
+			indstats->need_update = true;
+			indstats->num_pages = stats->num_pages;
+			indstats->num_tuples = stats->num_index_tuples;
+		}
+		else
+			vac_update_relstats(indrel,
+								stats->num_pages,
+								stats->num_index_tuples,
+								0,
+								false,
+								InvalidTransactionId,
+								InvalidMultiXactId,
+								false);
+	}
 
+	/* Both the leader and the workers make this report */
 	ereport(elevel,
 			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
 					RelationGetRelationName(indrel),
@@ -2074,57 +1841,51 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+void
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks)
 {
 	long		maxtuples;
-	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
-	autovacuum_work_mem != -1 ?
-	autovacuum_work_mem : maintenance_work_mem;
+	LVTidMap	*dead_tuples;
 
-	if (vacrelstats->hasindex)
-	{
-		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
-		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+	Assert(lvstate->dead_tuples == NULL);
 
-		/* curious coding here to ensure the multiplication can't overflow */
-		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
-			maxtuples = relblocks * LAZY_ALLOC_TUPLES;
+	maxtuples = lazy_get_max_dead_tuples(lvstate->vacrelstats,
+										 relblocks);
 
-		/* stay sane if small maintenance_work_mem */
-		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
-	}
-	else
-	{
-		maxtuples = MaxHeapTuplesPerPage;
-	}
+	dead_tuples = (LVTidMap *) palloc(SizeOfLVTidMap +
+									  sizeof(ItemPointerData) * (int) maxtuples);
+	dead_tuples->max_items = maxtuples;
+	dead_tuples->num_items = 0;
+	dead_tuples->shared = false;
+	dead_tuples->item_idx = 0;
+	dead_tuples->vacuumed_pages = 0;
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	lvstate->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
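+	/* The dead tuple array lives in DSM during parallel vacuum, so serialize access */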
+	if (dead_tuples->shared)
+		SpinLockAcquire(&(dead_tuples->mutex));
+
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_items < dead_tuples->max_items)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_items] = *itemptr;
+		(dead_tuples->num_items)++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_items);
 	}
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&(dead_tuples->mutex));
 }
 
 /*
@@ -2135,14 +1896,14 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
  *		Assumes dead_tuples array is in sorted order.
  */
 static bool
-lazy_tid_reaped(ItemPointer itemptr, void *state)
+lazy_tid_reaped(ItemPointer itemptr, void *dt)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) dt;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								lazy_get_dead_tuple_count(dead_tuples),
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2290,3 +2051,1255 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Perform a single-process or parallel lazy vacuum.
+ */
+static void
+lazy_scan_heap(LVState *lvstate)
+{
+	LVRelStats		*vacrelstats = lvstate->vacrelstats;
+	LVLeader		*lvleader = NULL;
+	Relation		onerel = lvstate->relation;
+	bool		isFinished = false;
+	int			nworkers = 0;
+	BlockNumber	nblocks;
+	char		*relname;
+	StringInfoData	buf;
+	PGRUsage		ru0;
+
+	/* Reset the parallel vacuum worker state */
+	WorkerState = NULL;
+
+	relname = RelationGetRelationName(onerel);
+
+	/* Plan the number of parallel workers */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		nworkers = plan_lazy_vacuum_workers(RelationGetRelid(lvstate->relation),
+											lvstate->options.nworkers);
+
+	if (nworkers > 0)
+	{
+		/* Set parallel context and attempt to launch parallel workers */
+		lvleader = lazy_vacuum_begin_parallel(lvstate, nworkers);
+
+		/* Fall back to a single-process vacuum if no worker was launched */
+		if (lvleader == NULL)
+			nworkers = 0;
+	}
+
+	if (nworkers == 0)
+	{
+		/* Prepare dead tuple space for the single lazy scan heap */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+	}
+
+	pg_rusage_init(&ru0);
+
+	if (lvstate->aggressive)
+		ereport(elevel,
+				(errmsg("aggressively vacuuming \"%s.%s\"",
+						get_namespace_name(RelationGetNamespace(onerel)),
+						relname)));
+	else
+		ereport(elevel,
+				(errmsg("vacuuming \"%s.%s\"",
+						get_namespace_name(RelationGetNamespace(onerel)),
+						relname)));
+
+	nblocks = RelationGetNumberOfBlocks(onerel);
+	vacrelstats->rel_pages = nblocks;
+
+	vacrelstats->scanned_pages = 0;
+	vacrelstats->tupcount_pages = 0;
+	vacrelstats->nonempty_pages = 0;
+	vacrelstats->empty_pages = 0;
+	vacrelstats->latestRemovedXid = InvalidTransactionId;
+
+	lvstate->lvscan = lv_beginscan(onerel, lvstate->lvshared, lvstate->aggressive,
+						  (lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0);
+	lvstate->indbulkstats = (IndexBulkDeleteResult **)
+		palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+	lvstate->frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
+
+	/* Do the actual lazy vacuum */
+	while (!isFinished)
+	{
+		int ndeadtuples;
+		const int	hvp_index[] = {
+			PROGRESS_VACUUM_PHASE,
+			PROGRESS_VACUUM_NUM_INDEX_VACUUMS
+		};
+		int64		hvp_val[2];
+
+		/*
+		 * Scan the heap until the end of the table or, if the table has
+		 * indexes, until the dead tuple space is full.
+		 */
+		ndeadtuples = do_lazy_scan_heap(lvstate, &isFinished);
+
+		/* Reached the end of the table with no garbage to clean up */
+		if (isFinished && ndeadtuples == 0)
+			break;
+
+		/* Log cleanup info before we touch indexes */
+		vacuum_log_cleanup_info(onerel, vacrelstats);
+
+		/* Wait for any workers to finish scanning, then prepare the index vacuum */
+		lazy_prepare_next_state(lvstate, lvleader, VACSTATE_VACUUM_INDEX);
+
+		/* Report that we are now vacuuming indexes */
+		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+
+		/* Remove index entries */
+		lazy_vacuum_all_indexes(lvstate);
+
+		/* Prepare the heap vacuum */
+		lazy_prepare_next_state(lvstate, lvleader, VACSTATE_VACUUM_HEAP);
+
+		/*
+		 * Report that we are now vacuuming the heap.  We also increase
+		 * the number of index scans here; note that by using
+		 * pgstat_progress_update_multi_param we can update both
+		 * parameters atomically.
+		 */
+		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
+		hvp_val[1] = vacrelstats->num_index_scans + 1;
+		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+
+		/* Remove tuples from heap */
+		lazy_vacuum_heap(onerel, lvstate);
+
+		/*
+		 * Vacuum the Free Space Map to make newly-freed space visible on
+		 * upper-level FSM pages.  Note we have not yet processed blkno.
+		 */
+		FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
+								lvstate->current_block);
+		lvstate->next_fsm_block_to_vacuum = lvstate->current_block;
+
+		vacrelstats->num_index_scans++;
+
+		if (!isFinished)
+		{
+			/*
+			 * Prepare for the next heap scan. Forget the now-vacuumed tuples,
+			 * and press on, but be careful not to reset latestRemovedXid since
+			 * we want that value to be valid.
+			 */
+			lazy_prepare_next_state(lvstate, lvleader, VACSTATE_SCAN);
+			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
+		}
+		else
+			pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, nblocks);
+	}
+
+	/* report that everything is scanned and vacuumed */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, nblocks);
+
+	/* End heap scan */
+	lv_endscan(lvstate->lvscan);
+
+	pfree(lvstate->frozen);
+
+	/*
+	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
+	 * not there were indexes.
+	 */
+	if (lvstate->current_block > lvstate->next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
+								lvstate->current_block);
+
+	/* Prepare the cleanup index */
+	lazy_prepare_next_state(lvstate, lvleader, VACSTATE_CLEANUP_INDEX);
+
+	/* report all blocks vacuumed; and that we're cleaning up */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/* Do post-vacuum cleanup and statistics update for each index */
+	lazy_cleanup_all_indexes(lvstate);
+
+	/* Shut down all vacuum workers */
+	lazy_prepare_next_state(lvstate, lvleader, VACSTATE_COMPLETED);
+
+	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	if (vacrelstats->vacuumed_pages)
+		ereport(elevel,
+				(errmsg("\"%s\": removed %.0f row versions in %u pages",
+						RelationGetRelationName(onerel),
+						vacrelstats->tuples_deleted, vacrelstats->vacuumed_pages)));
+
+	/* Finish parallel lazy vacuum and update index statistics */
+	if (nworkers > 0)
+		lazy_vacuum_end_parallel(lvstate, lvleader, true);
+
+	/*
+	 * This is pretty messy, but we split it up so that we can skip emitting
+	 * individual parts of the message when not applicable.
+	 */
+	initStringInfo(&buf);
+	appendStringInfo(&buf,
+					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
+					 vacrelstats->new_dead_tuples, OldestXmin);
+	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
+					 vacrelstats->unused_tuples);
+	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
+									"Skipped %u pages due to buffer pins, ",
+									vacrelstats->pinskipped_pages),
+					 vacrelstats->pinskipped_pages);
+	appendStringInfo(&buf, ngettext("%u frozen page.\n",
+									"%u frozen pages.\n",
+									vacrelstats->frozenskipped_pages),
+					 vacrelstats->frozenskipped_pages);
+	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
+									"%u pages are entirely empty.\n",
+									vacrelstats->empty_pages),
+					 vacrelstats->empty_pages);
+	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
+
+	ereport(elevel,
+			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
+					RelationGetRelationName(lvstate->relation),
+					vacrelstats->tuples_deleted, vacrelstats->num_tuples,
+					vacrelstats->scanned_pages, nblocks),
+			 errdetail_internal("%s", buf.data)));
+	pfree(buf.data);
+}
+
+/*
+ * Create the parallel context and launch workers for lazy vacuum.
+ * This function also constructs the leader's lvstate.
+ */
+static LVLeader *
+lazy_vacuum_begin_parallel(LVState *lvstate, int request)
+{
+	LVLeader		*lvleader = palloc(sizeof(LVLeader));
+	ParallelContext *pcxt;
+	Size			estshared,
+					estvacstats,
+					estindstats,
+					estdt,
+					estworker;
+	LVRelStats		*vacrelstats;
+	LVShared		*lvshared;
+	int			querylen;
+	int 		keys = 0;
+	char		*sharedquery;
+	long	 	maxtuples;
+	int			nparticipants = request + 1;
+	int i;
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
+								 request, true);
+	lvleader->pcxt = pcxt;
+
+	/* Calculate maximum dead tuples we store */
+	maxtuples = lazy_get_max_dead_tuples(lvstate->vacrelstats,
+										 RelationGetNumberOfBlocks(lvstate->relation));
+
+	/* Estimate size for shared state -- VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(sizeof(LVShared));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for the workers' vacuum statistics -- VACUUM_KEY_VACUUM_STATS */
+	estvacstats = MAXALIGN(mul_size(sizeof(LVRelStats), request));
+	shm_toc_estimate_chunk(&pcxt->estimator, estvacstats);
+	keys++;
+
+	/* Estimate size for parallel worker status including the leader -- VACUUM_KEY_WORKERS */
+	estworker = MAXALIGN(SizeOfLVWorkerState +
+						 mul_size(sizeof(VacuumWorker), request));
+	shm_toc_estimate_chunk(&pcxt->estimator, estworker);
+	keys++;
+
+	/* We need the shared dead tuple space only when the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		/* Estimate size for index statistics -- VACUUM_KEY_INDEX_STATS */
+		estindstats = MAXALIGN(SizeOfLVIndStats +
+							   mul_size(sizeof(IndexStats), lvstate->nindexes));
+		shm_toc_estimate_chunk(&pcxt->estimator, estindstats);
+		keys++;
+
+		/* Estimate size for dead tuple control -- VACUUM_KEY_DEAD_TUPLES */
+		estdt = MAXALIGN(SizeOfLVTidMap +
+						 mul_size(sizeof(ItemPointerData), maxtuples));
+		shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+		keys++;
+	}
+	else
+	{
+		/* Dead tuples are stored in local memory if there are no indexes */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+		lvstate->indstats = NULL;
+	}
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate VACUUM_KEY_QUERY_TEXT space.  Autovacuum doesn't
+	 * set debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+
+	InitializeParallelDSM(pcxt);
+
+	/*
+	 * Initialize dynamic shared memory for parallel lazy vacuum. We store
+	 * relevant information for the parallel heap scan, the dead tuple array,
+	 * per-worker vacuum statistics, and some parameters for lazy vacuum.
+	 */
+	lvshared = shm_toc_allocate(pcxt->toc, estshared);
+	lvshared->relid = lvstate->relid;
+	lvshared->aggressive = lvstate->aggressive;
+	lvshared->options = lvstate->options;
+	lvshared->oldestXmin = OldestXmin;
+	lvshared->freezeLimit = FreezeLimit;
+	lvshared->multiXactCutoff = MultiXactCutoff;
+	lvshared->elevel = elevel;
+	lvshared->is_wraparound = lvstate->is_wraparound;
+	lvshared->cost_delay = VacuumCostDelay;
+	lvshared->cost_limit = VacuumCostLimit;
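+	/* Give each participant, the leader included, an even share of the dead tuple space */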
+	lvshared->max_dead_tuples_per_worker = maxtuples / nparticipants;
+	heap_parallelscan_initialize(&lvshared->heapdesc, lvstate->relation, SnapshotAny);
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_SHARED, lvshared);
+	lvstate->lvshared = lvshared;
+
+	/* Prepare vacuum relation statistics */
+	vacrelstats = (LVRelStats *) shm_toc_allocate(pcxt->toc, estvacstats);
+	for (i = 0; i < request; i++)
+		memcpy(&vacrelstats[i], lvstate->vacrelstats, sizeof(LVRelStats));
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_VACUUM_STATS, vacrelstats);
+	lvleader->allrelstats = vacrelstats;
+
+	/* Prepare worker status */
+	WorkerState = (LVWorkerState *) shm_toc_allocate(pcxt->toc, estworker);
+	ConditionVariableInit(&WorkerState->cv);
+	LWLockInitialize(&WorkerState->vacuumlock, LWTRANCHE_PARALLEL_VACUUM);
+	WorkerState->nparticipantvacuum = request;
+	for (i = 0; i < request; i++)
+	{
+		VacuumWorker *worker = &(WorkerState->workers[i]);
+
+		worker->pid = InvalidPid;
+		worker->state = VACSTATE_INVALID;	/* initial state */
+		SpinLockInit(&worker->mutex);
+	}
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_WORKERS, WorkerState);
+
+	/* Prepare index statistics and dead tuple space if the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		LVIndStats	*indstats;
+		LVTidMap	*dead_tuples;
+
+		/* Prepare Index statistics */
+		indstats = shm_toc_allocate(pcxt->toc, estindstats);
+		indstats->nindexes = lvstate->nindexes;
+		indstats->nprocessed = 0;
+		MemSet(indstats->stats, 0, sizeof(IndexStats) * indstats->nindexes);
+		SpinLockInit(&indstats->mutex);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_INDEX_STATS, indstats);
+		lvstate->indstats = indstats;
+
+		/* Prepare shared dead tuples space */
+		dead_tuples = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+		dead_tuples->max_items = maxtuples;
+		dead_tuples->num_items = 0;
+		dead_tuples->item_idx = 0;
+		dead_tuples->shared = true;
+		SpinLockInit(&dead_tuples->mutex);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+		lvstate->dead_tuples = dead_tuples;
+	}
+
+	/* Store query string for workers */
+	if (debug_query_string)
+	{
+		sharedquery = shm_toc_allocate(pcxt->toc, querylen + 1);
+		memcpy(sharedquery, debug_query_string, querylen + 1);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_QUERY_TEXT, sharedquery);
+	}
+
+	/* Report ourselves as the leader process */
+	pgstat_report_leader_pid(MyProcPid);
+
+	/* Launch workers */
+	LaunchParallelWorkers(pcxt);
+
+	if (pcxt->nworkers_launched == 0)
+	{
+		lazy_vacuum_end_parallel(lvstate, lvleader, false);
+		pfree(lvleader);
+		return NULL;
+	}
+
+	/* Update the number of workers participating */
+	WorkerState->nparticipantvacuum_launched = pcxt->nworkers_launched;
+
+	lazy_wait_for_vacuum_workers_attach(pcxt);
+
+	return lvleader;
+}
+
+/*
+ * Wait for all workers to finish and exit parallel vacuum. If update_stats
+ * is true, gather vacuum statistics from all parallel workers and
+ * update the index statistics.
+ */
+static void
+lazy_vacuum_end_parallel(LVState *lvstate, LVLeader *lvleader, bool update_stats)
+{
+	IndexStats *copied_indstats = NULL;
+
+	if (update_stats)
+	{
+		/* Copy the index stats before destroying the parallel context */
+		copied_indstats = palloc(sizeof(IndexStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->indstats->stats,
+			   sizeof(IndexStats) * lvstate->nindexes);
+	}
+
+	/* Wait for the workers to finish vacuuming */
+	WaitForParallelWorkersToFinish(lvleader->pcxt);
+
+	/* End parallel mode */
+	DestroyParallelContext(lvleader->pcxt);
+	ExitParallelMode();
+
+	/*
+	 * Since we cannot do any updates while in parallel mode, we update the
+	 * index statistics after exiting it.
+	 */
+	if (update_stats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			Relation	ind = lvstate->indRels[i];
+			IndexStats *istat = (IndexStats *) &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (istat->need_update)
+				vac_update_relstats(ind,
+									istat->num_pages,
+									istat->num_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+	}
+
+	/* Reset shared fields */
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+	WorkerState = NULL;
+}
+
+/*
+ * lazy_gather_worker_stats() -- Gather vacuum statistics from workers
+ */
+static void
+lazy_gather_worker_stats(LVLeader *lvleader, LVRelStats *vacrelstats)
+{
+	int	i;
+
+	if (!IsInParallelMode())
+		return;
+
+	/* Gather worker stats */
+	for (i = 0; i < (WorkerState->nparticipantvacuum_launched); i++)
+	{
+		LVRelStats *wstats = (LVRelStats *) &lvleader->allrelstats[i];
+
+		vacrelstats->scanned_pages += wstats->scanned_pages;
+		vacrelstats->pinskipped_pages += wstats->pinskipped_pages;
+		vacrelstats->frozenskipped_pages += wstats->frozenskipped_pages;
+		vacrelstats->tupcount_pages += wstats->tupcount_pages;
+		vacrelstats->empty_pages += wstats->empty_pages;
+		vacrelstats->vacuumed_pages += wstats->vacuumed_pages;
+		vacrelstats->num_tuples += wstats->num_tuples;
+		vacrelstats->live_tuples += wstats->live_tuples;
+		vacrelstats->tuples_deleted += wstats->tuples_deleted;
+		vacrelstats->unused_tuples += wstats->unused_tuples;
+		vacrelstats->pages_removed += wstats->pages_removed;
+		vacrelstats->new_dead_tuples += wstats->new_dead_tuples;
+		vacrelstats->new_live_tuples += wstats->new_live_tuples;
+		vacrelstats->nonempty_pages += wstats->nonempty_pages;
+	}
+}
+
+/*
+ * Return the maximum number of dead tuples that can be stored,
+ * according to vac_work_mem.
+ */
+static long
+lazy_get_max_dead_tuples(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	long maxtuples;
+	int	vac_work_mem = IsAutoVacuumWorkerProcess() &&
+		autovacuum_work_mem != -1 ?
+		autovacuum_work_mem : maintenance_work_mem;
+
+	if (vacrelstats->hasindex)
+	{
+		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+		maxtuples = Min(maxtuples, INT_MAX);
+		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+
+		/* curious coding here to ensure the multiplication can't overflow */
+		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
+			maxtuples = relblocks * LAZY_ALLOC_TUPLES;
+
+		/* stay sane if small maintenance_work_mem */
+		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
+	}
+	else
+		maxtuples = MaxHeapTuplesPerPage;
+
+	return maxtuples;
+}
+
+/*
+ * lazy_prepare_next_state
+ *
+ * Do the preparation work for the next state before entering it. In parallel
+ * lazy vacuum, we must wait for all vacuum workers to finish the previous
+ * state before the preparation. After the preparation we change the state of
+ * all vacuum workers and wake them up.
+ */
+static void
+lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader, int next_state)
+{
+	/* Wait for vacuum workers to finish the previous state */
+	if (IsInParallelMode())
+		lazy_wait_for_vacuum_workers_to_be_done();
+
+	switch (next_state)
+	{
+		/*
+		 * Do the preparation work before entering the next state. Since we
+		 * can guarantee that no vacuum worker touches or modifies the parallel
+		 * vacuum shared data during the preparation, we don't need to take any
+		 * lazy-vacuum-related locks to modify the shared data.
+		 */
+		case VACSTATE_SCAN:
+			{
+				LVTidMap *dead_tuples = lvstate->dead_tuples;
+
+				/* Before scanning heap, clear the dead tuples */
+				MemSet(dead_tuples->itemptrs, 0,
+					   sizeof(ItemPointerData) * dead_tuples->max_items);
+				dead_tuples->num_items = 0;
+				dead_tuples->item_idx = 0;
+				dead_tuples->vacuumed_pages = 0;
+				break;
+			}
+		case VACSTATE_VACUUM_INDEX:
+			{
+				LVTidMap	*dead_tuples = lvstate->dead_tuples;
+
+				/*
+				 * Before vacuuming indexes, sort the dead tuple array; the
+				 * workers may have recorded TIDs out of order, and both
+				 * lazy_tid_reaped() and lazy_get_next_vacuum_page() require
+				 * TID order.
+				 */
+				qsort((void *) dead_tuples->itemptrs,
+					  dead_tuples->num_items,
+					  sizeof(ItemPointerData), vac_cmp_itemptr);
+
+				/* Reset the process counter of index vacuum */
+				if (lvstate->indstats)
+					lvstate->indstats->nprocessed = 0;
+				break;
+			}
+		case VACSTATE_CLEANUP_INDEX:
+		{
+			LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+			/* Gather vacuum statistics of all vacuum workers */
+			lazy_gather_worker_stats(lvleader, lvstate->vacrelstats);
+
+			/* now we can compute the new value for pg_class.reltuples */
+			vacrelstats->new_live_tuples = vac_estimate_reltuples(lvstate->relation,
+																  vacrelstats->rel_pages,
+																  vacrelstats->tupcount_pages,
+																  vacrelstats->live_tuples);
+
+			/* also compute total number of surviving heap entries */
+			vacrelstats->new_rel_tuples =
+				vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
+
+			/* Reset the process counter of index vacuum */
+			if (lvstate->indstats)
+				lvstate->indstats->nprocessed = 0;
+			break;
+		}
+		case VACSTATE_COMPLETED:
+		case VACSTATE_VACUUM_HEAP:
+			/* No preparation work is needed before vacuuming the heap or exiting */
+			break;
+		case VACSTATE_INVALID:
+			elog(ERROR, "unexpected vacuum state %d", next_state);
+			break;
+		default:
+			elog(ERROR, "invalid vacuum state %d", next_state);
+	}
+
+	/* Set all vacuum workers to the next state and wake them up */
+	if (IsInParallelMode())
+	{
+		lazy_set_workers_state(next_state);
+		ConditionVariableBroadcast(&WorkerState->cv);
+	}
+}
+
+/*
+ * lazy_dead_tuples_is_full - is the dead tuple space full?
+ *
+ * Return true if the dead tuple space is full.
+ */
+static bool
+lazy_dead_tuples_is_full(LVTidMap *dead_tuples)
+{
+	bool isfull;
+
+	if (dead_tuples->shared)
+		SpinLockAcquire(&(dead_tuples->mutex));
+
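+	/* "Full" means there is no longer room for a whole page's worth of TIDs */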
+	isfull = ((dead_tuples->num_items > 0) &&
+			  ((dead_tuples->max_items - dead_tuples->num_items) < MaxHeapTuplesPerPage));
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&(dead_tuples->mutex));
+
+	return isfull;
+}
+
+/*
+ * lazy_get_dead_tuple_count
+ *
+ * Get the current number of collected dead tuples.
+ */
+static int
+lazy_get_dead_tuple_count(LVTidMap *dead_tuples)
+{
+	int num_items;
+
+	if (dead_tuples->shared)
+		SpinLockAcquire(&dead_tuples->mutex);
+
+	num_items = dead_tuples->num_items;
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&dead_tuples->mutex);
+
+	return num_items;
+}
+
+/*
+ * lazy_get_next_vacuum_page
+ *
+ * For vacuuming heap pages, return the block number to vacuum next, taken
+ * from the dead tuple space. We also advance the dead tuple index past all
+ * entries for that block, ready for the next call.
+ *
+ * NB: the dead_tuples array must be sorted in TID order.
+ */
+static BlockNumber
+lazy_get_next_vacuum_page(LVState *lvstate, int *tupindex_p, int *npages_p)
+{
+	LVTidMap	*dead_tuples = lvstate->dead_tuples;
+	BlockNumber tblk;
+	BlockNumber	prev_tblk = InvalidBlockNumber;
+	BlockNumber	vacuum_tblk;
+
+	Assert(tupindex_p != NULL && npages_p != NULL);
+
+	if (!dead_tuples->shared)
+	{
+		/* Reached the end of dead tuples */
+		if (dead_tuples->item_idx >= dead_tuples->num_items)
+			return InvalidBlockNumber;
+
+		tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+		*tupindex_p = dead_tuples->item_idx;
+
+		/* Advance the index past all dead tuples on this block */
+		while (dead_tuples->item_idx < dead_tuples->num_items &&
+			   ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx])) == tblk)
+			dead_tuples->item_idx++;
+
+		(*npages_p)++;
+		return tblk;
+	}
+
+	/*
+	 * For parallel vacuum, we need a lock.
+	 *
+	 * XXX: the maximum number of tuples we need to advance past is small,
+	 * at most MaxHeapTuplesPerPage, so a spinlock suffices here.
+	 */
+	SpinLockAcquire(&(dead_tuples->mutex));
+
+	if (dead_tuples->item_idx >= dead_tuples->num_items)
+	{
+		/* Reached the end of dead tuples array */
+		vacuum_tblk = InvalidBlockNumber;
+		*tupindex_p = dead_tuples->num_items;
+		*npages_p = dead_tuples->vacuumed_pages;
+		goto done;
+	}
+
+	/* Set the block number being vacuumed next */
+	vacuum_tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+
+	/* Set the output arguments */
+	*tupindex_p = dead_tuples->item_idx;
+	*npages_p = ++(dead_tuples->vacuumed_pages);
+
+	/* Advance the index to the beginning of the next different block */
+	while (dead_tuples->item_idx < dead_tuples->num_items)
+	{
+		tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+
+		if (BlockNumberIsValid(prev_tblk) && prev_tblk != tblk)
+			break;
+
+		prev_tblk = tblk;
+		dead_tuples->item_idx++;
+	}
+
+done:
+	SpinLockRelease(&(dead_tuples->mutex));
+
+	return vacuum_tblk;
+}
+
+/*
+ * Vacuum all indexes. In parallel vacuum, the workers claim indexes
+ * one by one, and mark each index as done after vacuuming it. This marking
+ * is necessary to guarantee that all indexes are vacuumed based on
+ * the currently collected dead tuples. The leader process continues to
+ * vacuum even if an index was not vacuumed completely because a parallel
+ * worker failed for whatever reason. The mark will be checked before
+ * entering the next state.
+ */
+void
+lazy_vacuum_all_indexes(LVState *lvstate)
+{
+	int idx;
+	int nprocessed = 0;
+	LVIndStats *sharedstats = lvstate->indstats;
+
+	/* Claim the first index number to vacuum */
+	if (IsInParallelMode())
+	{
+		Assert(sharedstats != NULL);
+		SpinLockAcquire(&(sharedstats->mutex));
+		idx = (sharedstats->nprocessed)++;
+		SpinLockRelease(&sharedstats->mutex);
+	}
+	else
+		idx = nprocessed++;
+
+	while (idx < lvstate->nindexes)
+	{
+		/* Remove index entries */
+		lazy_vacuum_index(lvstate->indRels[idx], &lvstate->indbulkstats[idx],
+						  lvstate->vacrelstats, lvstate->dead_tuples);
+
+		/* Claim the next index number to vacuum */
+		if (IsInParallelMode())
+		{
+			SpinLockAcquire(&(sharedstats->mutex));
+			idx = (sharedstats->nprocessed)++;
+			SpinLockRelease(&sharedstats->mutex);
+		}
+		else
+			idx = nprocessed++;
+	}
+}
+
+/*
+ * Clean up all indexes.
+ * This function is similar to lazy_vacuum_all_indexes.
+ */
+void
+lazy_cleanup_all_indexes(LVState *lvstate)
+{
+	int idx;
+	int nprocessed = 0;
+	LVIndStats *sharedstats = lvstate->indstats;
+
+	/* Return if no indexes */
+	if (lvstate->nindexes == 0)
+		return;
+
+	/* Get the target index number */
+	if (IsInParallelMode())
+	{
+		Assert(sharedstats != NULL);
+		SpinLockAcquire(&(sharedstats->mutex));
+		idx = (sharedstats->nprocessed)++;
+		SpinLockRelease(&sharedstats->mutex);
+	}
+	else
+		idx = nprocessed++;
+
+	while (idx < lvstate->nindexes)
+	{
+		/*
+		 * Do post-vacuum cleanup. Update statistics for each index if not
+		 * in parallel vacuum.
+		 */
+		lazy_cleanup_index(lvstate->indRels[idx],
+						   lvstate->indbulkstats[idx],
+						   lvstate->vacrelstats,
+						   (lvstate->indstats) ? &(sharedstats->stats[idx]) : NULL);
+
+		/* Get the next target index number */
+		if (IsInParallelMode())
+		{
+			SpinLockAcquire(&(sharedstats->mutex));
+			idx = (sharedstats->nprocessed)++;
+			SpinLockRelease(&sharedstats->mutex);
+		}
+		else
+			idx = nprocessed++;
+	}
+}
+
+/*
+ * Set xid limits. This function is for parallel vacuum workers.
+ */
+void
+vacuum_set_xid_limits_for_worker(TransactionId oldestxmin, TransactionId freezelimit,
+								  MultiXactId multixactcutoff)
+{
+	OldestXmin = oldestxmin;
+	FreezeLimit = freezelimit;
+	MultiXactCutoff = multixactcutoff;
+}
+
+/*
+ * Set error level during lazy vacuum for vacuum workers.
+ */
+void
+vacuum_set_elevel_for_worker(int worker_elevel)
+{
+	elevel = worker_elevel;
+}
+
+/*
+ * lazy_set_workers_state - set a new state for all parallel workers
+ */
+static void
+lazy_set_workers_state(VacWorkerState new_state)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
+	{
+		VacuumWorker *w = &WorkerState->workers[i];
+
+		SpinLockAcquire(&w->mutex);
+		if (!IsVacuumWorkerInvalid(w->pid))
+			w->state = new_state;
+		SpinLockRelease(&w->mutex);
+	}
+}
+
+/*
+ * Wait for parallel vacuum workers to attach to both the shmem context
+ * and a worker slot. This ensures that the leader can see the states
+ * of all launched workers when it checks them.
+ */
+static void
+lazy_wait_for_vacuum_workers_attach(ParallelContext *pcxt)
+{
+	int i;
+
+	/* Wait for workers to attach to the shmem context */
+	WaitForParallelWorkersToAttach(pcxt);
+
+	/* Also, wait for workers to attach to the vacuum worker slot */
+	for (i = 0; i < pcxt->nworkers_launched; i++)
+	{
+		VacuumWorker	*worker = &WorkerState->workers[i];
+		int rc;
+
+		for (;;)
+		{
+			pid_t pid;
+
+			CHECK_FOR_INTERRUPTS();
+
+			/*
+			 * If the worker stopped without attaching to the vacuum worker
+			 * slot, throw an error.
+			 */
+			if (IsVacuumWorkerStopped(worker->pid))
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("parallel vacuum worker failed to initialize")));
+
+			SpinLockAcquire(&worker->mutex);
+			pid = worker->pid;
+			SpinLockRelease(&worker->mutex);
+
+			/* The worker successfully attached */
+			if (pid != InvalidPid)
+				break;
+
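+			/* Sleep briefly (10ms), then recheck whether the worker has attached */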
+			rc = WaitLatch(MyLatch,
+						   WL_TIMEOUT | WL_POSTMASTER_DEATH,
+						   10L, WAIT_EVENT_PARALLEL_VACUUM_STARTUP);
+
+			if (rc & WL_POSTMASTER_DEATH)
+				return;
+
+			ResetLatch(MyLatch);
+		}
+	}
+}
+
+/*
+ * lazy_wait_for_vacuum_workers_to_be_done - have all workers finished?
+ *
+ * Wait for all parallel workers to change their state to VACSTATE_WORKER_DONE.
+ */
+static void
+lazy_wait_for_vacuum_workers_to_be_done(void)
+{
+	while (true)
+	{
+		int i;
+		bool all_finished = true;
+
+		CHECK_FOR_INTERRUPTS();
+
+		for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
+		{
+			VacuumWorker *w = &WorkerState->workers[i];
+			pid_t	pid;
+			int		state;
+
+			SpinLockAcquire(&w->mutex);
+			pid = w->pid;
+			state = w->state;
+			SpinLockRelease(&w->mutex);
+
+			/* Skip unused slot */
+			if (IsVacuumWorkerInvalid(pid))
+				continue;
+
+			if (state != VACSTATE_WORKER_DONE)
+			{
+				/* Not finished */
+				all_finished = false;
+				break;
+			}
+		}
+
+		/* All vacuum workers are done */
+		if (all_finished)
+			break;
+
+		ConditionVariableSleep(&WorkerState->cv, WAIT_EVENT_PARALLEL_VACUUM);
+	}
+
+	ConditionVariableCancelSleep();
+}
+
+/*
+ * lv_beginscan() -- begin lazy vacuum heap scan
+ *
+ * In parallel vacuum we use the parallel heap scan, so initialize the
+ * parallel heap scan descriptor.
+ */
+LVScanDesc
+lv_beginscan(Relation onerel, LVShared *lvshared, bool aggressive,
+			 bool disable_page_skipping)
+{
+	LVScanDesc	lvscan = (LVScanDesc) palloc(sizeof(LVScanDescData));
+
+	/* Scan target relation */
+	lvscan->lv_rel = onerel;
+	lvscan->lv_nblocks = RelationGetNumberOfBlocks(onerel);
+
+	/* Set scan options */
+	lvscan->aggressive = aggressive;
+	lvscan->disable_page_skipping = disable_page_skipping;
+
+	/* Initialize other fields */
+	lvscan->lv_heapscan = NULL;
+	lvscan->lv_cblock = 0;
+	lvscan->lv_next_unskippable_block = 0;
+
+	/* For parallel lazy vacuum */
+	if (lvshared)
+	{
+		Assert(!IsBootstrapProcessingMode());
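+
+		/*
+		 * Workers and the leader will claim blocks from this shared scan
+		 * via heap_parallelscan_nextpage().
+		 */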
+		lvscan->lv_heapscan = heap_beginscan_parallel(onerel, &lvshared->heapdesc);
+		heap_parallelscan_startblock_init(lvscan->lv_heapscan);
+	}
+
+	return lvscan;
+}
+
+/*
+ * lv_endscan() -- end lazy vacuum heap scan
+ */
+void
+lv_endscan(LVScanDesc lvscan)
+{
+	if (lvscan->lv_heapscan != NULL)
+		heap_endscan(lvscan->lv_heapscan);
+	pfree(lvscan);
+}
+
+/*
+ * Return the block number we need to scan next, or InvalidBlockNumber if
+ * scan finished.
+ *
+ * Except when aggressive is set, we want to skip pages that are
+ * all-visible according to the visibility map, but only when we can skip
+ * at least SKIP_PAGES_THRESHOLD consecutive pages.  Since we're reading
+ * sequentially, the OS should be doing readahead for us, so there's no
+ * gain in skipping a page now and then; that's likely to disable
+ * readahead and so be counterproductive. Also, skipping even a single
+ * page means that we can't update relfrozenxid, so we only want to do it
+ * if we can skip a goodly number of pages.
+ *
+ * When aggressive is set, we can't skip pages just because they are
+ * all-visible, but we can still skip pages that are all-frozen, since
+ * such pages do not need freezing and do not affect the value that we can
+ * safely set for relfrozenxid or relminmxid.
+ *
+ * In a single lazy scan, before entering the main loop, establish the
+ * invariant that next_unskippable_block is the next block number >= blkno
+ * that we can't skip based on the visibility map, either all-visible for
+ * a regular scan or all-frozen for an aggressive scan.  We set it to
+ * nblocks if there's no such block.  We also set up the skipping_blocks
+ * flag correctly at this stage.
+ *
+ * In a parallel lazy scan, we scan heap pages using the parallel heap scan.
+ * Each worker calls heap_parallelscan_nextpage() to exclusively claim the
+ * next block number to scan.  Unlike the single lazy scan, we skip
+ * all-visible blocks immediately.
+ *
+ * Note: The value returned by visibilitymap_get_status could be slightly
+ * out-of-date, since we make this test before reading the corresponding
+ * heap page or locking the buffer.  This is OK.  If we mistakenly think
+ * that the page is all-visible or all-frozen when in fact the flag's just
+ * been cleared, we might fail to vacuum the page.  It's easy to see that
+ * skipping a page when aggressive is not set is not a very big deal; we
+ * might leave some dead tuples lying around, but the next vacuum will
+ * find them.  But even when aggressive *is* set, it's still OK if we miss
+ * a page whose all-frozen marking has just been cleared.  Any new XIDs
+ * just added to that page are necessarily newer than the GlobalXmin we
+ * computed, so they'll have no effect on the value to which we can safely
+ * set relfrozenxid.  A similar argument applies for MXIDs and relminmxid.
+ *
+ * We will scan the table's last page, at least to the extent of
+ * determining whether it has tuples or not, even if it should be skipped
+ * according to the above rules; except when we've already determined that
+ * it's not worth trying to truncate the table.  This avoids having
+ * lazy_truncate_heap() take access-exclusive lock on the table to attempt
+ * a truncation that just fails immediately because there are tuples in
+ * the last page.  This is worth avoiding mainly because such a lock must
+ * be replayed on any hot standby, where it can be disruptive.
+ */
+static BlockNumber
+lazy_scan_get_nextpage(LVScanDesc lvscan, LVRelStats *vacrelstats,
+					   bool *all_visible_according_to_vm_p, Buffer *vmbuffer_p)
+{
+	BlockNumber blkno;
+
+	/* Parallel lazy scan mode */
+	if (lvscan->lv_heapscan)
+	{
+		/*
+		 * In parallel lazy vacuum, since we cannot know how many consecutive
+		 * all-visible pages exist in the table, we skip each all-visible
+		 * page immediately.
+		 */
+		while ((blkno = heap_parallelscan_nextpage(lvscan->lv_heapscan)) != InvalidBlockNumber)
+		{
+			*all_visible_according_to_vm_p = false;
+			vacuum_delay_point();
+
+			pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+
+			/* Consider skipping this page according to the visibility map */
+			if (!lvscan->disable_page_skipping &&
+				!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
+			{
+				uint8		vmstatus;
+
+				vmstatus = visibilitymap_get_status(lvscan->lv_rel, blkno, vmbuffer_p);
+
+				if (lvscan->aggressive)
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) != 0)
+					{
+						vacrelstats->frozenskipped_pages++;
+						continue;
+					}
+					else if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) != 0)
+						*all_visible_according_to_vm_p = true;
+				}
+				else
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) != 0)
+					{
+						if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) != 0)
+							vacrelstats->frozenskipped_pages++;
+						continue;
+					}
+				}
+			}
+
+			/* Okay, need to scan current blkno, break */
+			break;
+		}
+	}
+	else	/* Single lazy scan mode */
+	{
+		bool skipping_blocks = false;
+
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, lvscan->lv_cblock);
+
+		/* Initialize lv_next_unskippable_block if needed */
+		if (lvscan->lv_cblock == 0 && !lvscan->disable_page_skipping)
+		{
+			while (lvscan->lv_next_unskippable_block < lvscan->lv_nblocks)
+			{
+				uint8		vmstatus;
+
+				vmstatus = visibilitymap_get_status(lvscan->lv_rel,
+													lvscan->lv_next_unskippable_block,
+													vmbuffer_p);
+				if (lvscan->aggressive)
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
+						break;
+				}
+				else
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
+						break;
+				}
+				vacuum_delay_point();
+				lvscan->lv_next_unskippable_block++;
+			}
+
+			if (lvscan->lv_next_unskippable_block >= SKIP_PAGES_THRESHOLD)
+				skipping_blocks = true;
+			else
+				skipping_blocks = false;
+		}
+
+		/* Decide the block number we need to scan */
+		for (blkno = lvscan->lv_cblock; blkno < lvscan->lv_nblocks; blkno++)
+		{
+			if (blkno == lvscan->lv_next_unskippable_block)
+			{
+				/* Time to advance next_unskippable_block */
+				lvscan->lv_next_unskippable_block++;
+				if (!lvscan->disable_page_skipping)
+				{
+					while (lvscan->lv_next_unskippable_block < lvscan->lv_nblocks)
+					{
+						uint8		vmstatus;
+
+						vmstatus = visibilitymap_get_status(lvscan->lv_rel,
+															lvscan->lv_next_unskippable_block,
+															vmbuffer_p);
+						if (lvscan->aggressive)
+						{
+							if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
+								break;
+						}
+						else
+						{
+							if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
+								break;
+						}
+						vacuum_delay_point();
+						lvscan->lv_next_unskippable_block++;
+					}
+				}
+
+				/*
+				 * We know we can't skip the current block.  But set up
+				 * skipping_blocks to do the right thing at the following
+				 * blocks.
+				 */
+				if (lvscan->lv_next_unskippable_block - blkno > SKIP_PAGES_THRESHOLD)
+					skipping_blocks = true;
+				else
+					skipping_blocks = false;
+
+				/*
+				 * Normally, the fact that we can't skip this block must mean that
+				 * it's not all-visible.  But in an aggressive vacuum we know only
+				 * that it's not all-frozen, so it might still be all-visible.
+				 */
+				if (lvscan->aggressive && VM_ALL_VISIBLE(lvscan->lv_rel, blkno, vmbuffer_p))
+					*all_visible_according_to_vm_p = true;
+
+				/* Found the next unskippable block number */
+				break;
+			}
+			else
+			{
+				/*
+				 * The current block is potentially skippable; if we've seen a
+				 * long enough run of skippable blocks to justify skipping it, and
+				 * we're not forced to check it, then go ahead and skip.
+				 * Otherwise, the page must be at least all-visible if not
+				 * all-frozen, so we can set *all_visible_according_to_vm_p = true.
+				 */
+				if (skipping_blocks &&
+					!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
+				{
+					/*
+					 * Tricky, tricky.  If this is an aggressive vacuum, the page
+					 * must have been all-frozen at the time we checked whether it
+					 * was skippable, but it might not be any more.  We must be
+					 * careful to count it as a skipped all-frozen page in that
+					 * case, or else we'll think we can't update relfrozenxid and
+					 * relminmxid.  If it's not an aggressive vacuum, we don't
+					 * know whether it was all-frozen, so we have to recheck; but
+					 * in this case an approximate answer is OK.
+					 */
+					if (lvscan->aggressive || VM_ALL_FROZEN(lvscan->lv_rel, blkno, vmbuffer_p))
+						vacrelstats->frozenskipped_pages++;
+					continue;
+				}
+
+				*all_visible_according_to_vm_p = true;
+
+				/* We need to scan current blkno, break */
+				break;
+			}
+		} /* for */
+
+		/* Advance the current block number for the next scan */
+		lvscan->lv_cblock = blkno + 1;
+	}
+
+	return (blkno == lvscan->lv_nblocks) ? InvalidBlockNumber : blkno;
+}
diff --git a/src/backend/commands/vacuumworker.c b/src/backend/commands/vacuumworker.c
new file mode 100644
index 0000000..ccdc7b1
--- /dev/null
+++ b/src/backend/commands/vacuumworker.c
@@ -0,0 +1,327 @@
+/*-------------------------------------------------------------------------
+ *
+ * vacuumworker.c
+ *	  Parallel lazy vacuum worker.
+ *
+ * A parallel vacuum worker is a process that assists lazy vacuum.
+ * It repeatedly waits for its state to be changed by the vacuum leader
+ * process, and after finishing the work for a state it marks itself as done.
+ * Normal termination is also requested by the leader process.
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/commands/vacuumworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/parallel.h"
+#include "access/xact.h"
+#include "commands/vacuum.h"
+#include "commands/vacuum_internal.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "tcop/tcopprot.h"
+
+static VacuumWorker	*MyVacuumWorker = NULL;
+
+/* Parallel vacuum worker function prototypes */
+static void lvworker_set_state(VacWorkerState new_state);
+static VacWorkerState lvworker_get_state(void);
+static void lvworker_mainloop(LVState *lvstate);
+static void lvworker_wait_for_next_work(void);
+static void lvworker_attach(void);
+static void lvworker_detach(void);
+static void lvworker_onexit(int code, Datum arg);
+
+/*
+ * Perform work within a launched parallel process.
+ */
+void
+lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	LVState		*lvstate = (LVState *) palloc(sizeof(LVState));
+	LVShared	*lvshared;
+	LVRelStats	*vacrelstats;
+	char		*sharedquery;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker")));
+
+	/* Register the callback function */
+	before_shmem_exit(lvworker_onexit, (Datum) 0);
+
+	/* Look up worker state and attach to the vacuum worker slot */
+	WorkerState = (LVWorkerState *) shm_toc_lookup(toc, VACUUM_KEY_WORKERS, false);
+	lvworker_attach();
+
+	/* Set shared state */
+	lvshared = (LVShared *) shm_toc_lookup(toc, VACUUM_KEY_SHARED, false);
+
+	/*
+	 * Set debug_query_string. The debug_query_string cannot be found in
+	 * the autovacuum case.
+	 */
+	sharedquery = shm_toc_lookup(toc, VACUUM_KEY_QUERY_TEXT, true);
+	if (sharedquery)
+	{
+		debug_query_string = sharedquery;
+		pgstat_report_activity(STATE_RUNNING, debug_query_string);
+	}
+	else
+		pgstat_report_activity(STATE_RUNNING, lvshared->is_wraparound ?
+							   "autovacuum: parallel worker (to prevent wraparound)" :
+							   "autovacuum: parallel worker");
+
+	/* Each worker uses its own LVRelStats slot, indexed by ParallelWorkerNumber */
+	vacrelstats = (LVRelStats *) shm_toc_lookup(toc, VACUUM_KEY_VACUUM_STATS, false);
+
+	/* Set lazy vacuum state */
+	lvstate->relid = lvshared->relid;
+	lvstate->aggressive = lvshared->aggressive;
+	lvstate->options = lvshared->options;
+	lvstate->vacrelstats = vacrelstats + ParallelWorkerNumber;
+	lvstate->relation = relation_open(lvstate->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(lvstate->relation, RowExclusiveLock, &lvstate->nindexes,
+					 &lvstate->indRels);
+	lvstate->lvshared = lvshared;
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+
+	/*
+	 * Set the PROC_IN_VACUUM flag, which lets other concurrent VACUUMs know that
+	 * they can ignore this one while determining their OldestXmin. Also set the
+	 * PROC_VACUUM_FOR_WRAPAROUND flag. Please see the comment in vacuum_rel for
+	 * details.
+	 */
+	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+	MyPgXact->vacuumFlags |= PROC_IN_VACUUM;
+	if (lvshared->is_wraparound)
+		MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+	LWLockRelease(ProcArrayLock);
+
+	/* Set up space for index statistics and dead tuples if the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		LVTidMap		*dead_tuples;
+		LVIndStats		*indstats;
+
+		/* Attach shared dead tuples */
+		dead_tuples = (LVTidMap *) shm_toc_lookup(toc, VACUUM_KEY_DEAD_TUPLES, false);
+		lvstate->dead_tuples = dead_tuples;
+
+		/* Attach Shared index stats */
+		indstats = (LVIndStats *) shm_toc_lookup(toc, VACUUM_KEY_INDEX_STATS, false);
+		lvstate->indstats = indstats;
+
+		/* Prepare for index bulkdelete */
+		lvstate->indbulkstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+	}
+	else
+	{
+		/* Dead tuples are stored in local memory if there are no indexes */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+		lvstate->indstats = NULL;
+	}
+
+	/* Restore vacuum xid limits and elevel */
+	vacuum_set_xid_limits_for_worker(lvshared->oldestXmin, lvshared->freezeLimit,
+									 lvshared->multiXactCutoff);
+	vacuum_set_elevel_for_worker(lvshared->elevel);
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
+								  lvshared->relid);
+
+	/* Restore vacuum delay */
+	VacuumCostDelay = lvshared->cost_delay;
+	VacuumCostLimit = lvshared->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Begin lazy heap scan */
+	lvstate->lvscan = lv_beginscan(lvstate->relation, lvstate->lvshared, lvshared->aggressive,
+						  (lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0);
+
+	/* Prepare other fields */
+	lvstate->frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
+
+	/* Enter the main loop */
+	lvworker_mainloop(lvstate);
+
+	/* The lazy vacuum is done; do the post-processing */
+	lv_endscan(lvstate->lvscan);
+	pgstat_progress_end_command();
+	lvworker_detach();
+	cancel_before_shmem_exit(lvworker_onexit, (Datum) 0);
+
+	vac_close_indexes(lvstate->nindexes, lvstate->indRels, RowExclusiveLock);
+	heap_close(lvstate->relation, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Main loop for vacuum workers.
+ */
+static void
+lvworker_mainloop(LVState *lvstate)
+{
+	bool	exit = false;
+
+	/*
+	 * Loop until the leader commands us to exit.
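+	 *
+	 * The leader typically drives us through VACSTATE_SCAN, then one or
+	 * more rounds of VACSTATE_VACUUM_INDEX and VACSTATE_VACUUM_HEAP
+	 * (returning to VACSTATE_SCAN while the heap scan is unfinished), and
+	 * finally VACSTATE_CLEANUP_INDEX and VACSTATE_COMPLETED.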
+	 */
+	while (!exit)
+	{
+		VacWorkerState mystate;
+
+		/* Wait for the status to be changed by the leader */
+		lvworker_wait_for_next_work();
+
+		/* Get my new state */
+		mystate = lvworker_get_state();
+
+		/* Dispatch the work according to the state */
+		switch (mystate)
+		{
+			case VACSTATE_SCAN:
+				{
+					bool dummy;
+					do_lazy_scan_heap(lvstate, &dummy);
+					break;
+				}
+			case VACSTATE_VACUUM_INDEX:
+				{
+					lazy_vacuum_all_indexes(lvstate);
+					break;
+				}
+			case VACSTATE_VACUUM_HEAP:
+				{
+					lazy_vacuum_heap(lvstate->relation, lvstate);
+					break;
+				}
+			case VACSTATE_CLEANUP_INDEX:
+				{
+					lazy_cleanup_all_indexes(lvstate);
+					break;
+				}
+			case VACSTATE_COMPLETED:
+				{
+					/* The leader asked us to exit */
+					exit = true;
+					break;
+				}
+			case VACSTATE_INVALID:
+			case VACSTATE_WORKER_DONE:
+				{
+					elog(ERROR, "unexpected vacuum state %d", mystate);
+					break;
+				}
+		}
+
+		/* Mark my state as done after finishing the work */
+		lvworker_set_state(VACSTATE_WORKER_DONE);
+	}
+}
+
+/*
+ * Wait for my state to be changed by the vacuum leader.
+ */
+static void
+lvworker_wait_for_next_work(void)
+{
+	VacWorkerState mystate;
+
+	for (;;)
+	{
+		mystate = lvworker_get_state();
+
+		/* Got the next valid state from the vacuum leader */
+		if (mystate != VACSTATE_WORKER_DONE && mystate != VACSTATE_INVALID)
+			break;
+
+		/* Sleep until the next notification */
+		ConditionVariableSleep(&WorkerState->cv, WAIT_EVENT_PARALLEL_VACUUM);
+	}
+
+	ConditionVariableCancelSleep();
+}
+
+/*
+ * lvworker_get_state - get my current state
+ */
+static VacWorkerState
+lvworker_get_state(void)
+{
+	VacWorkerState state;
+
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	state = MyVacuumWorker->state;
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	return state;
+}
+
+/*
+ * lvworker_set_state - set my state to the given new state
+ */
+static void
+lvworker_set_state(VacWorkerState new_state)
+{
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	MyVacuumWorker->state = new_state;
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	ConditionVariableBroadcast(&WorkerState->cv);
+}
+
+/*
+ * Clean up function for parallel vacuum worker
+ */
+static void
+lvworker_onexit(int code, Datum arg)
+{
+	if (IsInParallelMode() && MyVacuumWorker)
+		lvworker_detach();
+}
+
+/*
+ * Detach from the worker slot and clean up worker information.
+ */
+static void
+lvworker_detach(void)
+{
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	MyVacuumWorker->state = VACSTATE_INVALID;
+	MyVacuumWorker->pid = 0;	/* the worker is dead */
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	MyVacuumWorker = NULL;
+}
+
+/*
+ * Attach to a worker slot according to its ParallelWorkerNumber.
+ */
+static void
+lvworker_attach(void)
+{
+	VacuumWorker *vworker;
+
+	LWLockAcquire(&WorkerState->vacuumlock, LW_EXCLUSIVE);
+	vworker = &WorkerState->workers[ParallelWorkerNumber];
+	vworker->pid = MyProcPid;
+	vworker->state = VACSTATE_SCAN; /* first state */
+	LWLockRelease(&WorkerState->vacuumlock);
+
+	MyVacuumWorker = vworker;
+}
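
The hunk above shows only the worker half of the handshake. For readers
following along, here is a minimal sketch of what the leader side has to do
per phase, assuming the WorkerState/VacuumWorker layout declared in
vacuum_internal.h below; the function name lvleader_run_phase is
illustrative and not part of the patch:

    /*
     * Hypothetical leader-side driver: push every live worker into the
     * next phase, then wait until each one reports VACSTATE_WORKER_DONE.
     */
    static void
    lvleader_run_phase(VacWorkerState next_state)
    {
        int     i;

        /* Command every live worker to enter the next phase */
        for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
        {
            VacuumWorker *w = &WorkerState->workers[i];

            SpinLockAcquire(&w->mutex);
            if (w->pid != 0)        /* skip dead workers */
                w->state = next_state;
            SpinLockRelease(&w->mutex);
        }
        ConditionVariableBroadcast(&WorkerState->cv);

        /* Wait for all live workers to report VACSTATE_WORKER_DONE */
        for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
        {
            VacuumWorker *w = &WorkerState->workers[i];

            for (;;)
            {
                VacWorkerState  state;
                pid_t           pid;

                SpinLockAcquire(&w->mutex);
                state = w->state;
                pid = w->pid;
                SpinLockRelease(&w->mutex);

                if (state == VACSTATE_WORKER_DONE || pid == 0)
                    break;
                ConditionVariableSleep(&WorkerState->cv,
                                       WAIT_EVENT_PARALLEL_VACUUM);
            }
            ConditionVariableCancelSleep();
        }
    }
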
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 378f2fa..fe6d777 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1667,7 +1667,12 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
+	if (a->options.flags != b->options.flags)
+		return false;
+
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
+
 	COMPARE_NODE_FIELD(rels);
 
 	return true;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 96bf060..3cedd37 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -52,6 +52,7 @@
 #include "parser/parsetree.h"
 #include "parser/parse_agg.h"
 #include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
 #include "storage/dsm_impl.h"
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
@@ -6022,6 +6023,138 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 }
 
 /*
+ * plan_lazy_vacuum_workers
+ *		Use the planner to decide how many parallel worker processes
+ *		VACUUM and autovacuum should request for use
+ *
+ * tableOid is the table being vacuumed; it must be a regular table, and
+ * must not be a special system table.
+ * nworkers_requested is the number of workers requested via the VACUUM
+ * PARALLEL option, or 0 if not requested.
+ *
+ * Return value is the number of parallel worker processes to request.  It
+ * may be unsafe to proceed if this is 0.  Note that this does not include the
+ * leader participating as a worker (value is always a number of parallel
+ * worker processes).
+ *
+ * Note: caller had better already hold some type of lock on the table and
+ * index.
+ */
+int
+plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested)
+{
+	int				parallel_workers;
+	PlannerInfo 	*root;
+	Query	   		*query;
+	PlannerGlobal 	*glob;
+	RangeTblEntry 	*rte;
+	RelOptInfo 		*rel;
+	Relation		heap;
+	BlockNumber		nblocks;
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/* Set up largely-dummy planner state */
+	query = makeNode(Query);
+	query->commandType = CMD_SELECT;
+
+	glob = makeNode(PlannerGlobal);
+
+	root = makeNode(PlannerInfo);
+	root->parse = query;
+	root->glob = glob;
+	root->query_level = 1;
+	root->planner_cxt = CurrentMemoryContext;
+	root->wt_param_id = -1;
+
+	/*
+	 * Build a minimal RTE.
+	 *
+	 * Set the target's table to be an inheritance parent.  This is a kludge
+	 * that prevents problems within get_relation_info(), which does not
+	 * expect that any IndexOptInfo is currently undergoing REINDEX.
+	 */
+	rte = makeNode(RangeTblEntry);
+	rte->rtekind = RTE_RELATION;
+	rte->relid = tableOid;
+	rte->relkind = RELKIND_RELATION;	/* Don't be too picky. */
+	rte->lateral = false;
+	rte->inh = true;
+	rte->inFromCl = true;
+	query->rtable = list_make1(rte);
+
+	/* Set up RTE/RelOptInfo arrays */
+	setup_simple_rel_arrays(root);
+
+	/* Build RelOptInfo */
+	rel = build_simple_rel(root, 1, NULL);
+
+	heap = heap_open(tableOid, NoLock);
+	nblocks = RelationGetNumberOfBlocks(heap);
+
+	/*
+	 * If a number of workers was requested, accept it (though still cap
+	 * at max_parallel_maintenance_workers).
+	 */
+	if (nworkers_requested > 0)
+	{
+		parallel_workers = Min(nworkers_requested,
+							   max_parallel_maintenance_workers);
+
+		if (parallel_workers != nworkers_requested)
+			ereport(NOTICE,
+					(errmsg("%d parallel vacuum workers requested, but capped by max_parallel_maintenance_workers",
+							nworkers_requested),
+					 errhint("Increase max_parallel_maintenance_workers.")));
+
+		goto done;
+	}
+
+	/*
+	 * If the parallel_workers storage parameter is set for the table, accept
+	 * that as the number of parallel worker processes to launch (though still
+	 * cap at max_parallel_maintenance_workers). Note that we deliberately do
+	 * not consider any other factor when parallel_workers is set (e.g.,
+	 * memory use by workers).
+	 */
+	if (rel->rel_parallel_workers != -1)
+	{
+		parallel_workers = Min(rel->rel_parallel_workers,
+							   max_parallel_maintenance_workers);
+		goto done;
+	}
+
+	/*
+	 * Determine number of workers to scan the heap relation using generic
+	 * model.
+	 */
+	parallel_workers = compute_parallel_worker(rel,
+											   nblocks,
+											   -1,
+											   max_parallel_maintenance_workers);
+	/*
+	 * Cap workers based on available maintenance_work_mem as needed.
+	 *
+	 * Note that each tuplesort participant receives an even share of the
+	 * total maintenance_work_mem budget.  Aim to leave participants
+	 * (including the leader as a participant) with no less than 32MB of
+	 * memory.  This leaves cases where maintenance_work_mem is set to 64MB
+	 * immediately past the threshold of being capable of launching a single
+	 * parallel worker to sort.
+	 */
+	while (parallel_workers > 0 &&
+		   maintenance_work_mem / (parallel_workers + 1) < 32768L)
+		parallel_workers--;
+
+done:
+	heap_close(heap, NoLock);
+
+	return parallel_workers;
+}
+
+/*
  * plan_create_index_workers
  *		Use the planner to decide how many parallel worker processes
  *		CREATE INDEX should request for use
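
To make the 32MB cap in plan_lazy_vacuum_workers above concrete (the loop
operates on maintenance_work_mem in kilobytes):

    maintenance_work_mem = 64MB (65536kB):
        65536 / (1 + 1) = 32768kB >= 32768kB  -> 1 worker allowed
        65536 / (2 + 1) = 21845kB <  32768kB  -> capped back to 1
    maintenance_work_mem = 63MB (64512kB):
        64512 / (1 + 1) = 32256kB <  32768kB  -> capped to 0 workers

So the default of 64MB sits exactly at the threshold of being able to
launch a single parallel worker, as the comment notes.
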
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 87f5e95..74e79fa 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOption *makeVacOpt(VacuumOptionFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOption		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10523,22 +10525,29 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					VacuumOption *vacopt = makeVacOpt(VACOPT_VACUUM, 0);
 					if ($2)
-						n->options |= VACOPT_FULL;
+						vacopt->flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						vacopt->flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						vacopt->flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						vacopt->flags |= VACOPT_ANALYZE;
+
+					n->options.flags = vacopt->flags;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
+					pfree(vacopt);
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
-					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					VacuumStmt 		*n = makeNode(VacuumStmt);
+					VacuumOption 	*vacopt = $3;
+
+					n->options.flags = vacopt->flags | VACOPT_VACUUM;
+					n->options.nworkers = vacopt->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10546,18 +10555,42 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOption *vacopt1 = (VacuumOption *) $1;
+					VacuumOption *vacopt2 = (VacuumOption *) $3;
+
+					/* OR flags */
+					vacopt1->flags |= vacopt2->flags;
+
+					/* Set requested parallel worker number */
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+
+					$$ = vacopt1;
+					pfree(vacopt2);
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+				{
+					if ($2 < 1)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("parallel vacuum degree must be at least 1"),
+								 parser_errposition(@1)));
+					$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+				}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10569,16 +10602,23 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					VacuumOption *vacopt = makeVacOpt(VACOPT_ANALYZE, 0);
+
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						vacopt->flags |= VACOPT_VERBOSE;
+
+					n->options.flags = vacopt->flags;
+					n->options.nworkers = 0;
 					n->rels = $3;
 					$$ = (Node *)n;
+					pfree(vacopt);
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
-					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					VacuumStmt 		*n = makeNode(VacuumStmt);
+
+					n->options.flags = $3 | VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16327,6 +16367,16 @@ makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
 	return (Node *) s;
 }
 
+static VacuumOption *
+makeVacOpt(VacuumOptionFlag flag, int nworkers)
+{
+	VacuumOption *vacopt = palloc(sizeof(VacuumOption));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
+
 /* parser_init()
  * Initialize to parse one query string
  */
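
With the grammar above, flags from the option list are ORed together while
the PARALLEL degree is carried separately in nworkers; for example (the
table name is a placeholder):

    VACUUM (PARALLEL 2, VERBOSE) test_tbl;  -- flags = VACUUM|VERBOSE|PARALLEL, nworkers = 2
    VACUUM (PARALLEL) test_tbl;             -- nworkers = 0: degree decided by the planner
    VACUUM (PARALLEL 0) test_tbl;           -- ERROR: parallel vacuum degree must be at least 1
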
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1d9cfc6..a3af5e0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,16 +187,16 @@ typedef struct av_relation
 /* struct to keep track of tables to vacuum and/or analyze, after rechecking */
 typedef struct autovac_table
 {
-	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
-	int			at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
-	bool		at_dobalance;
-	bool		at_sharedrel;
-	char	   *at_relname;
-	char	   *at_nspname;
-	char	   *at_datname;
+	Oid				at_relid;
+	VacuumOption	at_vacoptions;	/* VacuumOption (flags and nworkers) */
+	VacuumParams 	at_params;
+	int				at_vacuum_cost_delay;
+	int				at_vacuum_cost_limit;
+	bool			at_dobalance;
+	bool			at_sharedrel;
+	char	   		*at_relname;
+	char	   		*at_nspname;
+	char	   		*at_datname;
 } autovac_table;
 
 /*-------------
@@ -2496,7 +2496,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2848,6 +2848,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			vac_cost_limit;
 		int			vac_cost_delay;
 		int			log_min_duration;
+		int			parallel_workers;
 
 		/*
 		 * Calculate the vacuum cost parameters and the freeze ages.  If there
@@ -2894,13 +2895,20 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->multixact_freeze_table_age
 			: default_multixact_freeze_table_age;
 
+		parallel_workers = (avopts &&
+							avopts->vacuum_parallel_workers >= 0)
+			? avopts->vacuum_parallel_workers
+			: 0;
+
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
-			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+			(!wraparound ? VACOPT_SKIP_LOCKED : 0) |
+			(dovacuum ? VACOPT_PARALLEL : 0);	/* always consider parallel */
+		tab->at_vacoptions.nworkers = parallel_workers;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3146,10 +3154,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a5d1291..9d29c3a 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2487,7 +2487,6 @@ pgstat_fetch_stat_funcentry(Oid func_id)
 	return funcentry;
 }
 
-
 /* ----------
  * pgstat_fetch_stat_beentry() -
  *
@@ -3062,6 +3061,25 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 }
 
 /*-----------
+ * pgstat_report_leader_pid() -
+ *
+ * Report process id of the leader process that this backend is involved
+ * with.
+ */
+void
+pgstat_report_leader_pid(int pid)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+
+	if (!beentry)
+		return;
+
+	pgstat_increment_changecount_before(beentry);
+	beentry->st_leader_pid = pid;
+	pgstat_increment_changecount_after(beentry);
+}
+
+/*-----------
  * pgstat_progress_start_command() -
  *
  * Set st_progress_command (and st_progress_command_target) in own backend
@@ -3665,6 +3683,12 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_PARALLEL_CREATE_INDEX_SCAN:
 			event_name = "ParallelCreateIndexScan";
 			break;
+		case WAIT_EVENT_PARALLEL_VACUUM_STARTUP:
+			event_name = "ParallelVacuumStartup";
+			break;
+		case WAIT_EVENT_PARALLEL_VACUUM:
+			event_name = "ParallelVacuum";
+			break;
 		case WAIT_EVENT_PROCARRAY_GROUP_UPDATE:
 			event_name = "ProcArrayGroupUpdate";
 			break;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index b5804f6..b3400a1 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -661,7 +661,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2567,7 +2567,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e95e347..67aaabf 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -439,7 +439,7 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_PROGRESS_COLS	PGSTAT_NUM_PROGRESS_PARAM + 3
+#define PG_STAT_GET_PROGRESS_COLS	PGSTAT_NUM_PROGRESS_PARAM + 4
 	int			num_backends = pgstat_fetch_stat_numbackends();
 	int			curr_backend;
 	char	   *cmd = text_to_cstring(PG_GETARG_TEXT_PP(0));
@@ -516,14 +516,16 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		if (has_privs_of_role(GetUserId(), beentry->st_userid))
 		{
 			values[2] = ObjectIdGetDatum(beentry->st_progress_command_target);
+			values[3] = Int32GetDatum(beentry->st_leader_pid);
 			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
-				values[i + 3] = Int64GetDatum(beentry->st_progress_param[i]);
+				values[i + 4] = Int64GetDatum(beentry->st_progress_param[i]);
 		}
 		else
 		{
 			nulls[2] = true;
+			nulls[3] = true;
 			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
-				nulls[i + 3] = true;
+				nulls[i + 4] = true;
 		}
 
 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a146510..03753cd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5172,9 +5172,9 @@
   proname => 'pg_stat_get_progress_info', prorows => '100', proretset => 't',
   provolatile => 's', proparallel => 'r', prorettype => 'record',
   proargtypes => 'text',
-  proallargtypes => '{text,int4,oid,oid,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8}',
-  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}',
+  proallargtypes => '{text,int4,oid,oid,int4,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{cmdtype,pid,datid,relid,leader_pid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}',
   prosrc => 'pg_stat_get_progress_info' },
 { oid => '3099',
   descr => 'statistics: information about currently active replication',
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 85d472f..790a255 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -14,15 +14,18 @@
 #ifndef VACUUM_H
 #define VACUUM_H
 
+#include "access/heapam_xlog.h"
 #include "access/htup.h"
+#include "access/parallel.h"
+#include "access/relscan.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
 #include "nodes/parsenodes.h"
 #include "storage/buf.h"
+#include "storage/condition_variable.h"
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
-
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
  * to be analyzed.  The struct and subsidiary data are in anl_context,
@@ -154,10 +157,9 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
-
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -187,8 +189,9 @@ extern void vac_update_datfrozenxid(void);
 extern void vacuum_delay_point(void);
 
 /* in commands/vacuumlazy.c */
-extern void lazy_vacuum_rel(Relation onerel, int options,
+extern void lazy_vacuum_rel(Relation onerel, VacuumOption options,
 				VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/commands/vacuum_internal.h b/src/include/commands/vacuum_internal.h
new file mode 100644
index 0000000..8a132f9
--- /dev/null
+++ b/src/include/commands/vacuum_internal.h
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * vacuum_internal.h
+ *	  Internal declarations for lazy vacuum
+ *
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/commands/vacuum_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef VACUUM_INTERNAL_H
+#define VACUUM_INTERNAL_H
+
+/* DSM key for parallel lazy vacuum */
+#define VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define VACUUM_KEY_VACUUM_STATS		UINT64CONST(0xFFFFFFFFFFF00002)
+#define VACUUM_KEY_INDEX_STATS	    UINT64CONST(0xFFFFFFFFFFF00003)
+#define VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00004)
+#define VACUUM_KEY_WORKERS			UINT64CONST(0xFFFFFFFFFFF00005)
+#define VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00006)
+
+/*
+ * Type definitions for lazy vacuum. The fields of these structs are
+ * accessed only by the vacuum leader.
+ */
+typedef struct LVScanDescData LVScanDescData;
+typedef struct LVScanDescData *LVScanDesc;
+typedef struct LVTidMap	LVTidMap;
+typedef struct LVIndStats LVIndStats;
+
+/* Vacuum worker state for parallel lazy vacuum */
+typedef enum VacWorkerState
+{
+	VACSTATE_INVALID = 0,
+	VACSTATE_SCAN,
+	VACSTATE_VACUUM_INDEX,
+	VACSTATE_VACUUM_HEAP,
+	VACSTATE_CLEANUP_INDEX,
+	VACSTATE_WORKER_DONE,
+	VACSTATE_COMPLETED
+} VacWorkerState;
+
+/*
+ * 'pid' always starts as InvalidPid, which means the vacuum worker is
+ * starting up; it is set by the vacuum worker itself during startup. When
+ * the vacuum worker exits or detaches from its worker slot, 'pid' is set
+ * to 0, which means the vacuum worker is dead.
+ */
+typedef struct VacuumWorker
+{
+	pid_t			pid;	/* parallel worker's pid.
+							   InvalidPid = not started yet; 0 = dead */
+
+	VacWorkerState	state;	/* current worker's state */
+	slock_t			mutex;	/* protect the above fields */
+} VacuumWorker;
+
+/* Struct to control parallel vacuum workers */
+typedef struct LVWorkerState
+{
+	int		nparticipantvacuum;		/* # of parallel workers requested,
+									   not including the leader */
+	int		nparticipantvacuum_launched;	/* # of workers actually launched,
+											   out of nparticipantvacuum */
+
+	/* condition variable signaled when changing status */
+	ConditionVariable	cv;
+
+	/* protect workers array */
+	LWLock				vacuumlock;
+
+	VacuumWorker workers[FLEXIBLE_ARRAY_MEMBER];
+} LVWorkerState;
+#define SizeOfLVWorkerState (offsetof(LVWorkerState, workers) + sizeof(VacuumWorker))
+
+typedef struct LVRelStats
+{
+	/* hasindex = true means two-pass strategy; false means one-pass */
+	bool		hasindex;
+	/* Overall statistics about rel */
+	BlockNumber old_rel_pages;		/* previous value of pg_class.relpages */
+	BlockNumber rel_pages;			/* total number of pages */
+	BlockNumber scanned_pages;		/* number of pages we examined */
+	BlockNumber pinskipped_pages;	/* # of pages we skipped due to a pin */
+	BlockNumber frozenskipped_pages;	/* # of frozen pages we skipped */
+	BlockNumber tupcount_pages;		/* pages whose tuples we counted */
+	BlockNumber empty_pages;		/* # of empty pages */
+	BlockNumber vacuumed_pages;		/* # of pages we vacuumed */
+	double		num_tuples;			/* total number of nonremoval tuples */
+	double		live_tuples;		/* live tuples (reltuples estimate) */
+	double		tuples_deleted;		/* tuples cleaned up by vacuum */
+	double		unused_tuples;		/* unused item pointers */
+	double		old_live_tuples;	/* previous value of pg_class.reltuples */
+	double		new_rel_tuples;		/* new estimated total # of tuples */
+	double		new_live_tuples;	/* new estimated total # of live tuples */
+	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	BlockNumber pages_removed;
+	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
+	int			num_index_scans;
+	TransactionId latestRemovedXid;
+	bool		lock_waiter_detected;
+} LVRelStats;
+
+/*
+ * Shared information among parallel workers.
+ */
+typedef struct LVShared
+{
+	/* Target relation's OID */
+	Oid		relid;
+
+	/* Options and thresholds used for lazy vacuum */
+	VacuumOption		options;
+	bool				aggressive;		/* is an aggressive vacuum? */
+	bool				is_wraparound;	/* for anti-wraparound purpose? */
+	int					elevel;			/* verbose logging */
+	TransactionId	oldestXmin;
+	TransactionId	freezeLimit;
+	MultiXactId		multiXactCutoff;
+
+	/* Vacuum delay */
+	int		cost_delay;
+	int		cost_limit;
+
+	int		max_dead_tuples_per_worker;	/* maximum # of dead tuples each worker can store */
+	ParallelHeapScanDescData heapdesc;	/* for heap scan */
+} LVShared;
+
+/*
+ * Working state for lazy vacuum execution. LVState is used by both vacuum
+ * workers and the vacuum leader. In parallel lazy vacuum, each worker's
+ * 'vacrelstats' and the 'dead_tuples' area exist in shared memory, in
+ * addition to the fields used only for parallel lazy vacuum: 'lvshared'
+ * and 'indstats'.
+ */
+typedef struct LVState
+{
+	/* Vacuum target relation and indexes */
+	Oid			relid;
+	Relation	relation;
+	Relation	*indRels;
+	int			nindexes;
+
+	/* Used during scanning heap */
+	IndexBulkDeleteResult	**indbulkstats;
+	xl_heap_freeze_tuple	*frozen;
+	BlockNumber				next_fsm_block_to_vacuum;
+	BlockNumber				current_block; /* block number being scanned */
+	VacuumOption			options;
+	bool					is_wraparound;
+
+	/* Scan description for lazy vacuum */
+	LVScanDesc	lvscan;
+	bool		aggressive;
+
+	/* Vacuum statistics for the target table */
+	LVRelStats	*vacrelstats;
+
+	/* Dead tuple array */
+	LVTidMap	*dead_tuples;
+
+	/*
+	 * The following fields are only present when a parallel lazy vacuum
+	 * is performed.
+	 */
+	LVShared		*lvshared;	/* shared information among vacuum workers */
+	LVIndStats		*indstats;	/* shared index statistics */
+
+} LVState;
+
+extern LVWorkerState	*WorkerState;
+
+extern LVScanDesc lv_beginscan(Relation relation, LVShared *lvshared,
+							   bool aggressive, bool disable_page_skipping);
+extern void lv_endscan(LVScanDesc lvscan);
+extern int do_lazy_scan_heap(LVState *lvstate, bool *isFinished);
+extern void lazy_cleanup_all_indexes(LVState *lvstate);
+extern void lazy_vacuum_all_indexes(LVState *lvstate);
+extern void lazy_vacuum_heap(Relation onerel, LVState *lvstate);
+extern void vacuum_set_xid_limits_for_worker(TransactionId oldestxmin,
+											 TransactionId freezelimit,
+											 MultiXactId multixactcutoff);
+extern void vacuum_set_elevel_for_worker(int elevel);
+extern void lazy_space_alloc(LVState *lvstate, BlockNumber relblocks);
+
+
+#endif							/* VACUUM_INTERNAL_H */
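
The VACUUM_KEY_* entries above correspond to the shm_toc_lookup() calls in
vacuumworker.c. A rough sketch of the leader-side initialization they imply
(variable names here are illustrative, and the shm_toc_estimate bookkeeping
is elided):

    /* hypothetical excerpt from the leader's parallel-vacuum setup */
    shm_toc_insert(pcxt->toc, VACUUM_KEY_SHARED, lvshared);
    shm_toc_insert(pcxt->toc, VACUUM_KEY_VACUUM_STATS, vacrelstats_array);
    shm_toc_insert(pcxt->toc, VACUUM_KEY_INDEX_STATS, indstats);
    shm_toc_insert(pcxt->toc, VACUUM_KEY_DEAD_TUPLES, dead_tuples);
    shm_toc_insert(pcxt->toc, VACUUM_KEY_WORKERS, WorkerState);
    shm_toc_insert(pcxt->toc, VACUUM_KEY_QUERY_TEXT, query_text);
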
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 07ab1a3..70faae9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3134,7 +3134,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3144,9 +3144,17 @@ typedef enum VacuumOption
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock (autovacuum
 									 * only) */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do VACUUM in parallel */
+} VacuumOptionFlag;
+
+typedef struct VacuumOption
+{
+	VacuumOptionFlag	flags;	/* OR of VacuumOptionFlag */
+	int					nworkers;	/* # of parallel vacuum workers */
 } VacuumOption;
 
+
 /*
  * Info about a single target table of VACUUM/ANALYZE.
  *
@@ -3164,9 +3172,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOption	options;		/* vacuum options (flags and nworkers) */
+	List	   		*rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index c090396..3a98f0c 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -58,5 +58,6 @@ extern Expr *preprocess_phv_expression(PlannerInfo *root, Expr *expr);
 
 extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
 extern int	plan_create_index_workers(Oid tableOid, Oid indexOid);
+extern int	plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested);
 
 #endif							/* PLANNER_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index d59c24a..650ba10 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -827,6 +827,8 @@ typedef enum
 	WAIT_EVENT_PARALLEL_FINISH,
 	WAIT_EVENT_PARALLEL_BITMAP_SCAN,
 	WAIT_EVENT_PARALLEL_CREATE_INDEX_SCAN,
+	WAIT_EVENT_PARALLEL_VACUUM_STARTUP,
+	WAIT_EVENT_PARALLEL_VACUUM,
 	WAIT_EVENT_PROCARRAY_GROUP_UPDATE,
 	WAIT_EVENT_CLOG_GROUP_UPDATE,
 	WAIT_EVENT_REPLICATION_ORIGIN_DROP,
@@ -1031,13 +1033,17 @@ typedef struct PgBackendStatus
 
 	/*
 	 * Command progress reporting.  Any command which wishes can advertise
-	 * that it is running by setting st_progress_command,
+	 * that it is running by setting st_leader_pid, st_progress_command,
 	 * st_progress_command_target, and st_progress_param[].
 	 * st_progress_command_target should be the OID of the relation which the
 	 * command targets (we assume there's just one, as this is meant for
 	 * utility commands), but the meaning of each element in the
 	 * st_progress_param array is command-specific.
+	 * st_leader_pid can be used for command progress reporting of a parallel
+	 * operation. By setting it to the pid of the parallel operation's leader,
+	 * workers can be grouped with their leader in progress-reporting SQL.
 	 */
+	int			st_leader_pid;
 	ProgressCommandType st_progress_command;
 	Oid			st_progress_command_target;
 	int64		st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
@@ -1204,6 +1210,7 @@ extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 									int buflen);
 extern const char *pgstat_get_backend_desc(BackendType backendType);
 
+extern void pgstat_report_leader_pid(int pid);
 extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 							  Oid relid);
 extern void pgstat_progress_update_param(int index, int64 val);
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2..10bd668 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,7 @@ typedef enum BuiltinTrancheIds
 	LWTRANCHE_SHARED_TUPLESTORE,
 	LWTRANCHE_TBM,
 	LWTRANCHE_PARALLEL_APPEND,
+	LWTRANCHE_PARALLEL_VACUUM,
 	LWTRANCHE_FIRST_USER_DEFINED
 }			BuiltinTrancheIds;
 
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 6ecbdb6..1f7e757 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -249,6 +249,7 @@ typedef struct AutoVacOpts
 	int			multixact_freeze_max_age;
 	int			multixact_freeze_table_age;
 	int			log_min_duration;
+	int			vacuum_parallel_workers;
 	float8		vacuum_scale_factor;
 	float8		analyze_scale_factor;
 } AutoVacOpts;
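
The new AutoVacOpts field is exposed as the autovacuum_vacuum_parallel_workers
reloption, so the per-table setting is controlled in the usual way (the table
name is a placeholder):

    ALTER TABLE test_tbl SET (autovacuum_vacuum_parallel_workers = 2);
    ALTER TABLE test_tbl RESET (autovacuum_vacuum_parallel_workers);
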
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 078129f..d697c6a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1836,13 +1836,21 @@ pg_stat_progress_vacuum| SELECT s.pid,
             ELSE NULL::text
         END AS phase,
     s.param2 AS heap_blks_total,
-    s.param3 AS heap_blks_scanned,
-    s.param4 AS heap_blks_vacuumed,
-    s.param5 AS index_vacuum_count,
+    w.heap_blks_scanned,
+    w.heap_blks_vacuumed,
+    w.index_vacuum_count,
     s.param6 AS max_dead_tuples,
-    s.param7 AS num_dead_tuples
-   FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+    w.num_dead_tuples
+   FROM ((pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, leader_pid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+     LEFT JOIN ( SELECT pg_stat_get_progress_info.leader_pid,
+            max(pg_stat_get_progress_info.param3) AS heap_blks_scanned,
+            max(pg_stat_get_progress_info.param4) AS heap_blks_vacuumed,
+            max(pg_stat_get_progress_info.param5) AS index_vacuum_count,
+            max(pg_stat_get_progress_info.param7) AS num_dead_tuples
+           FROM pg_stat_get_progress_info('VACUUM'::text) pg_stat_get_progress_info(pid, datid, relid, leader_pid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+          GROUP BY pg_stat_get_progress_info.leader_pid) w ON ((s.pid = w.leader_pid)))
+  WHERE (s.pid = s.leader_pid);
 pg_stat_replication| SELECT s.pid,
     s.usesysid,
     u.rolname AS usename,
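
With the view redefined as above, an ordinary query keeps working during a
parallel vacuum; each row represents one leader, with the workers' counters
already aggregated via max():

    SELECT pid, phase, heap_blks_total, heap_blks_scanned, index_vacuum_count
    FROM pg_stat_progress_vacuum;
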
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index d66e2aa..48f28ac 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -119,6 +119,8 @@ ANALYZE (nonexistant-arg) does_not_exist;
 ERROR:  syntax error at or near "nonexistant"
 LINE 1: ANALYZE (nonexistant-arg) does_not_exist;
                  ^
+-- parallel option
+VACUUM (PARALLEL 1) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 275ce2e..4571bcc 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -93,6 +93,9 @@ ANALYZE vactst (i), vacparted (does_not_exist);
 ANALYZE (VERBOSE) does_not_exist;
 ANALYZE (nonexistant-arg) does_not_exist;
 
+-- parallel option
+VACUUM (PARALLEL 1) vactst;
+
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
-- 
2.10.5

Attachment: v7-0001-Publish-some-parallel-heap-scan-functions.patch (application/octet-stream)
From f60de50e42d62ece1bfd2e6d95f44736b60ba7e2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Aug 2018 14:37:40 +0900
Subject: [PATCH v7 1/2] Publish some parallel heap scan functions.

---
 src/backend/access/heap/heapam.c | 6 ++----
 src/include/access/heapam.h      | 2 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 72395a5..5e21f09 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -93,8 +93,6 @@ static HeapScanDesc heap_beginscan_internal(Relation relation,
 						bool is_bitmapscan,
 						bool is_samplescan,
 						bool temp_snap);
-static void heap_parallelscan_startblock_init(HeapScanDesc scan);
-static BlockNumber heap_parallelscan_nextpage(HeapScanDesc scan);
 static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
 					TransactionId xid, CommandId cid, int options);
 static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
@@ -1694,7 +1692,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
  *		only to set the startblock once.
  * ----------------
  */
-static void
+void
 heap_parallelscan_startblock_init(HeapScanDesc scan)
 {
 	BlockNumber sync_startpage = InvalidBlockNumber;
@@ -1742,7 +1740,7 @@ retry:
  *		first backend gets an InvalidBlockNumber return.
  * ----------------
  */
-static BlockNumber
+BlockNumber
 heap_parallelscan_nextpage(HeapScanDesc scan)
 {
 	BlockNumber page;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ca5cad7..a424136 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -126,6 +126,8 @@ extern void heap_rescan_set_params(HeapScanDesc scan, ScanKey key,
 					   bool allow_strat, bool allow_sync, bool allow_pagemode);
 extern void heap_endscan(HeapScanDesc scan);
 extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
+extern BlockNumber heap_parallelscan_nextpage(HeapScanDesc scan);
+extern void heap_parallelscan_startblock_init(HeapScanDesc scan);
 
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
-- 
2.10.5

#2Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#1)

On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, I was thinking the commit is relevant with this issue but as
Amit mentioned this error is emitted by DROP SCHEMA CASCASE.
I don't find out the cause of this issue yet. With the previous
version patch, autovacuum workers were woking with one parallel worker
but it never drops relations. So it's possible that the error might
not have been relevant with the patch but anywayI'll continue to work
on that.

This depends on the extension lock patch from
/messages/by-id/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com/
if I am following correctly. So I propose to mark this patch as
returned with feedback for now, and come back to it once the root
problems are addressed. Feel free to correct me if you think that's
not adapted.

I've re-designed the parallel vacuum patch. Attached the latest
version patch. As the discussion so far, this patch depends on the
extension lock patch[1]. However I think we can discuss the design
part of parallel vacuum independently from that patch. That's way I'm
proposing the new patch. In this patch, I structured and refined the
lazy_scan_heap() because it's a single big function and not suitable
for making it parallel.

The parallel vacuum worker processes keep waiting for commands from
the parallel vacuum leader process. Before entering each phase of lazy
vacuum such as scanning heap, vacuum index and vacuum heap, the leader
process changes the all workers state to the next state. Vacuum worker
processes do the job according to the their state and wait for the
next command after finished. Also in before entering the next phase,
the leader process does some preparation works while vacuum workers is
sleeping; for example, clearing shared dead tuple space before
entering the 'scanning heap' phase. The status of vacuum workers are
stored into a DSM area pointed by WorkerState variables, and
controlled by the leader process. FOr the basic design and performance
improvements please refer to my presentation at PGCon 2018[2].

The number of parallel vacuum workers is determined according to
either the table size or PARALLEL option in VACUUM command. The
maximum of parallel workers is max_parallel_maintenance_workers.

I've separated the code for vacuum worker process to
backends/commands/vacuumworker.c, and created
includes/commands/vacuum_internal.h file to declare the definitions
for the lazy vacuum.

For autovacuum, this patch allows autovacuum worker process to use the
parallel option according to the relation size or the reloption. But
autovacuum delay, since there is no slots for parallel worker of
autovacuum in AutoVacuumShmem this patch doesn't support the change of
the autovacuum delay configuration during running.

Attached is a rebased version of the patch against the current HEAD.

Please apply this patch with the extension lock patch[1] when testing
as this patch can try to extend visibility map pages concurrently.

Because the patch leads to performance degradation when bulk-loading
into a partitioned table, I think that the original proposal, which
makes relation extension locks conflict even under group locking, is
the more realistic approach. So I worked on this with the simple patch
instead of [1]. Attached three patches:

* 0001 patch publishes some static functions such as
heap_parallelscan_startblock_init so that the parallel vacuum code can
use them.
* 0002 patch makes relation extension locks conflict even under group locking.
* 0003 patch adds the PARALLEL option to lazy vacuum.

Please review them.
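
To illustrate the 0002 behavior: the leader and its parallel workers share
one lock group, so without 0002 a worker would be granted the relation
extension lock even while the leader holds it, and both could try to extend
the same heap or visibility map page at once. With 0002 the sequence
becomes (pids are illustrative):

    leader (pid 1000): LockAcquire(relation extension) -> granted
    worker (pid 1001): LockAcquire(relation extension)
                       -> LockCheckConflicts() sees LOCKTAG_RELATION_EXTEND
                          and returns STATUS_FOUND despite the shared lock
                          group, so the worker waits
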

[1]: /messages/by-id/CAD21AoBn8WbOt21MFfj1mQmL2ZD8KVgMHYrOe1F5ozsQC4Z_hw@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#2)
3 attachment(s)

On Tue, Oct 30, 2018 at 5:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, I was thinking the commit is relevant with this issue but as
Amit mentioned this error is emitted by DROP SCHEMA CASCASE.
I don't find out the cause of this issue yet. With the previous
version patch, autovacuum workers were woking with one parallel worker
but it never drops relations. So it's possible that the error might
not have been relevant with the patch but anywayI'll continue to work
on that.

This depends on the extension lock patch from
/messages/by-id/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com/
if I am following correctly. So I propose to mark this patch as
returned with feedback for now, and come back to it once the root
problems are addressed. Feel free to correct me if you think that's
not adapted.

I've re-designed the parallel vacuum patch. Attached the latest
version patch. As the discussion so far, this patch depends on the
extension lock patch[1]. However I think we can discuss the design
part of parallel vacuum independently from that patch. That's way I'm
proposing the new patch. In this patch, I structured and refined the
lazy_scan_heap() because it's a single big function and not suitable
for making it parallel.

The parallel vacuum worker processes keep waiting for commands from
the parallel vacuum leader process. Before entering each phase of lazy
vacuum such as scanning heap, vacuum index and vacuum heap, the leader
process changes the all workers state to the next state. Vacuum worker
processes do the job according to the their state and wait for the
next command after finished. Also in before entering the next phase,
the leader process does some preparation works while vacuum workers is
sleeping; for example, clearing shared dead tuple space before
entering the 'scanning heap' phase. The status of vacuum workers are
stored into a DSM area pointed by WorkerState variables, and
controlled by the leader process. FOr the basic design and performance
improvements please refer to my presentation at PGCon 2018[2].

The number of parallel vacuum workers is determined according to
either the table size or PARALLEL option in VACUUM command. The
maximum of parallel workers is max_parallel_maintenance_workers.

I've separated the code for vacuum worker process to
backends/commands/vacuumworker.c, and created
includes/commands/vacuum_internal.h file to declare the definitions
for the lazy vacuum.

For autovacuum, this patch allows autovacuum worker process to use the
parallel option according to the relation size or the reloption. But
autovacuum delay, since there is no slots for parallel worker of
autovacuum in AutoVacuumShmem this patch doesn't support the change of
the autovacuum delay configuration during running.

Attached is a rebased version of the patch against the current HEAD.

Please apply this patch with the extension lock patch[1] when testing
as this patch can try to extend visibility map pages concurrently.

Because the patch leads to performance degradation when bulk-loading
into a partitioned table, I think that the original proposal, which
makes relation extension locks conflict even under group locking, is
the more realistic approach. So I worked on this with the simple patch
instead of [1]. Attached three patches:

* 0001 patch publishes some static functions such as
heap_parallelscan_startblock_init so that the parallel vacuum code can
use them.
* 0002 patch makes relation extension locks conflict even under group locking.
* 0003 patch adds the PARALLEL option to lazy vacuum.

Please review them.

Oops, forgot to attach patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

Attachment: v8-0001-Publish-some-parallel-heap-scan-functions.patch (application/x-patch)
From ada1d927644355e4813ed050449a985c69a9989c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Aug 2018 14:37:40 +0900
Subject: [PATCH v8 1/3] Publish some parallel heap scan functions.

---
 src/backend/access/heap/heapam.c | 6 ++----
 src/include/access/heapam.h      | 2 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index fb63471..e599294 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -93,8 +93,6 @@ static HeapScanDesc heap_beginscan_internal(Relation relation,
 						bool is_bitmapscan,
 						bool is_samplescan,
 						bool temp_snap);
-static void heap_parallelscan_startblock_init(HeapScanDesc scan);
-static BlockNumber heap_parallelscan_nextpage(HeapScanDesc scan);
 static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
 					TransactionId xid, CommandId cid, int options);
 static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
@@ -1706,7 +1704,7 @@ heap_beginscan_parallel(Relation relation, ParallelHeapScanDesc parallel_scan)
  *		only to set the startblock once.
  * ----------------
  */
-static void
+void
 heap_parallelscan_startblock_init(HeapScanDesc scan)
 {
 	BlockNumber sync_startpage = InvalidBlockNumber;
@@ -1754,7 +1752,7 @@ retry:
  *		first backend gets an InvalidBlockNumber return.
  * ----------------
  */
-static BlockNumber
+BlockNumber
 heap_parallelscan_nextpage(HeapScanDesc scan)
 {
 	BlockNumber page;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 40e153f..c7204fc 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -127,6 +127,8 @@ extern void heap_rescan_set_params(HeapScanDesc scan, ScanKey key,
 					   bool allow_strat, bool allow_sync, bool allow_pagemode);
 extern void heap_endscan(HeapScanDesc scan);
 extern HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction);
+extern BlockNumber heap_parallelscan_nextpage(HeapScanDesc scan);
+extern void heap_parallelscan_startblock_init(HeapScanDesc scan);
 
 extern Size heap_parallelscan_estimate(Snapshot snapshot);
 extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
-- 
2.10.5

Attachment: v8-0002-Make-group-locking-conflict-when-relation-exntesi.patch (application/x-patch)
From eb4e53aefd1d67b133c50f5f32b3835338ced5ef Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 30 Oct 2018 17:17:39 +0900
Subject: [PATCH v8 2/3] Make group locking conflict when relation exntesion.

---
 src/backend/storage/lmgr/lock.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 10f6f60..be00c02 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -1405,6 +1405,14 @@ LockCheckConflicts(LockMethod lockMethodTable,
 		return STATUS_FOUND;
 	}
 
+	/* Relation extension locks are conflict even in group locking */
+	if (lock->tag.locktag_type == LOCKTAG_RELATION_EXTEND)
+	{
+		PROCLOCK_PRINT("LockCheckConflicts: conflicting (group)",
+					   proclock);
+		return STATUS_FOUND;
+	}
+
 	/*
 	 * Locks held in conflicting modes by members of our own lock group are
 	 * not real conflicts; we can subtract those out and see if we still have
-- 
2.10.5

Attachment: v8-0003-Add-parallel-option-to-lazy-vacuum.patch (application/x-patch)
From 578b693020848e0dfca05b8dac9b7dec5934339b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Aug 2018 14:38:21 +0900
Subject: [PATCH v8 3/3] Add parallel option to lazy vacuum.

---
 doc/src/sgml/config.sgml               |    9 +-
 doc/src/sgml/ref/create_table.sgml     |   11 +-
 doc/src/sgml/ref/vacuum.sgml           |   16 +
 src/backend/access/common/reloptions.c |   10 +
 src/backend/access/transam/parallel.c  |    7 +
 src/backend/catalog/system_views.sql   |   23 +-
 src/backend/commands/Makefile          |    2 +-
 src/backend/commands/vacuum.c          |   71 +-
 src/backend/commands/vacuumlazy.c      | 2087 ++++++++++++++++++++++++--------
 src/backend/commands/vacuumworker.c    |  327 +++++
 src/backend/nodes/equalfuncs.c         |    7 +-
 src/backend/optimizer/plan/planner.c   |  133 ++
 src/backend/parser/gram.y              |   90 +-
 src/backend/postmaster/autovacuum.c    |   38 +-
 src/backend/postmaster/pgstat.c        |   25 +-
 src/backend/tcop/utility.c             |    4 +-
 src/backend/utils/adt/pgstatfuncs.c    |    8 +-
 src/include/catalog/pg_proc.dat        |    6 +-
 src/include/commands/vacuum.h          |   11 +-
 src/include/commands/vacuum_internal.h |  191 +++
 src/include/nodes/parsenodes.h         |   18 +-
 src/include/optimizer/planner.h        |    1 +
 src/include/pgstat.h                   |    9 +-
 src/include/storage/lwlock.h           |    1 +
 src/include/utils/rel.h                |    1 +
 src/test/regress/expected/rules.out    |   20 +-
 src/test/regress/expected/vacuum.out   |    2 +
 src/test/regress/sql/vacuum.sql        |    3 +
 28 files changed, 2487 insertions(+), 644 deletions(-)
 create mode 100644 src/backend/commands/vacuumworker.c
 create mode 100644 src/include/commands/vacuum_internal.h

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7554cba..da8c8d3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2142,10 +2142,11 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from the
          pool of processes established by <xref
          linkend="guc-max-worker-processes"/>, limited by <xref
          linkend="guc-max-parallel-workers"/>.  Note that the requested
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 10428f8..d4d1106 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1423,7 +1423,16 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>autovacuum_vacuum_parallel_workers</literal>, <literal>toast.autovacuum_vacuum_parallel_workers</literal> (<type>integer</type>)</term>
+    <listitem>
+     <para>
+      Sets the number of parallel workers that can be used to vacuum this table. If not set, autovacuum runs without parallel workers (non-parallel).
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>autovacuum_freeze_min_age</literal>, <literal>toast.autovacuum_freeze_min_age</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..a742107 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL <replaceable class="parameter">N</replaceable>
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,21 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute <command>VACUUM</command> in parallel by <replaceable class="parameter">N
+      </replaceable>a background workers. Collecting garbage on table is processed
+      in block-level parallel. For tables with indexes, parallel vacuum assigns each
+      index to each parallel vacuum worker and all garbages on a index are processed
+      by particular parallel vacuum worker. The maximum nunber of parallel workers
+      is <xref linkend="guc-max-parallel-workers-maintenance"/>. This option can not
+      use with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index db84da0..45e2bca 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -348,6 +348,14 @@ static relopt_int intRelOpts[] =
 		},
 		-1, 0, 1024
 	},
+	{
+		{
+			"autovacuum_vacuum_parallel_workers",
+			"Number of parallel processes that can be used to vacuum for this relation",
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+			ShareUpdateExclusiveLock
+		}, -1, 0, 1024
+	},
 
 	/* list terminator */
 	{{NULL}}
@@ -1377,6 +1385,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, analyze_scale_factor)},
+		{"autovacuum_vacuum_parallel_workers", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_parallel_workers)},
 		{"user_catalog_table", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, user_catalog_table)},
 		{"parallel_workers", RELOPT_TYPE_INT,
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 8419719..dbb3e5d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -23,6 +23,7 @@
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
+#include "commands/vacuum.h"
 #include "executor/execParallel.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"lazy_parallel_vacuum_main", lazy_parallel_vacuum_main
 	}
 };
 
@@ -1283,6 +1287,9 @@ ParallelWorkerMain(Datum main_arg)
 	ParallelMasterBackendId = fps->parallel_master_backend_id;
 	on_shmem_exit(ParallelWorkerShutdown, (Datum) 0);
 
+	/* Report pid of master process for progress information */
+	pgstat_report_leader_pid(fps->parallel_master_pid);
+
 	/*
 	 * Now we can find and attach to the error queue provided for us.  That's
 	 * good, because until we do that, any errors that happen here will not be
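
Registering lazy_parallel_vacuum_main here lets the leader launch vacuum
workers through the stock parallel-context machinery. A simplified sketch of
the leader-side launch sequence under that assumption (the actual setup
presumably lives in lazy_vacuum_begin_parallel):

    EnterParallelMode();
    pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
                                 nworkers, true);
    /* estimate and initialize DSM: dead tuple space, stats, worker states */
    InitializeParallelDSM(pcxt);
    LaunchParallelWorkers(pcxt);
    /* ... drive the scan heap / vacuum index / vacuum heap states ... */
    WaitForParallelWorkersToFinish(pcxt);
    DestroyParallelContext(pcxt);
    ExitParallelMode();
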
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 53ddc59..a74b426 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -897,11 +897,24 @@ CREATE VIEW pg_stat_progress_vacuum AS
 					  WHEN 5 THEN 'truncating heap'
 					  WHEN 6 THEN 'performing final cleanup'
 					  END AS phase,
-		S.param2 AS heap_blks_total, S.param3 AS heap_blks_scanned,
-		S.param4 AS heap_blks_vacuumed, S.param5 AS index_vacuum_count,
-		S.param6 AS max_dead_tuples, S.param7 AS num_dead_tuples
-    FROM pg_stat_get_progress_info('VACUUM') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+		S.param2 AS heap_blks_total,
+		W.heap_blks_scanned,
+		W.heap_blks_vacuumed,
+		W.index_vacuum_count,
+		S.param6 AS max_dead_tuples,
+		W.num_dead_tuples
+	FROM pg_stat_get_progress_info('VACUUM') AS S
+	        LEFT JOIN pg_database D ON S.datid = D.oid
+		LEFT JOIN
+		(SELECT leader_pid,
+			max(param3) AS heap_blks_scanned,
+			max(param4) AS heap_blks_vacuumed,
+			max(param5) AS index_vacuum_count,
+			max(param7) AS num_dead_tuples
+	        FROM pg_stat_get_progress_info('VACUUM')
+		GROUP BY leader_pid) AS W ON S.pid = W.leader_pid
+	WHERE
+		S.pid = S.leader_pid;
 
 CREATE VIEW pg_user_mappings AS
     SELECT
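
Each worker advertises its leader's pid via pgstat_report_leader_pid (see the
parallel.c hunk above); the view then keeps the leader rows (S.pid =
S.leader_pid) and folds the workers' progress in with max(). To inspect the
raw per-process rows, one could query the underlying function directly -- a
sketch, assuming the patched pg_stat_get_progress_info exposes leader_pid as
the view implies:

    SELECT pid, leader_pid, param3 AS heap_blks_scanned
      FROM pg_stat_get_progress_info('VACUUM');
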
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 4a6c99e..c3623da 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
 	policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \
 	schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \
 	tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \
-	vacuum.o vacuumlazy.o variable.o view.o
+	vacuum.o vacuumlazy.o vacuumworker.o variable.o view.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index a86963f..0eb38b2 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -38,6 +38,7 @@
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "optimizer/planner.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
@@ -68,13 +69,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options);
+static List *get_all_vacuum_rels(VacuumOption options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +90,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -116,7 +117,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -163,7 +164,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +175,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +185,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +207,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +217,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +282,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +336,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +355,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +391,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -603,7 +604,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +636,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +648,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +674,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +743,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +761,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = HeapTupleGetOid(tuple);
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1521,7 +1522,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1543,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1583,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1606,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1678,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1697,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1705,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
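
From this hunk onward, the old int options bitmask has become a VacuumOption
struct accessed as options.flags. The definition lives in a header outside
this hunk; presumably something along these lines, with a second field to
carry the PARALLEL N argument (the field names here are a guess):

    typedef struct VacuumOption
    {
        int     flags;      /* OR of VACOPT_* flags */
        int     nworkers;   /* # of workers requested by PARALLEL N */
    } VacuumOption;
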
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8996d36..e4f4183 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -22,6 +22,17 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum can be performed with parallel workers. In parallel lazy
+ * vacuum, multiple vacuum worker processes get blocks in parallel using the
+ * parallel heap scan and process each of them. If the table has indexes,
+ * the parallel vacuum workers vacuum the heap and indexes in parallel, and
+ * the dead tuple TIDs are shared among all vacuum processes including the
+ * leader process. Before entering each state, such as scanning the heap or
+ * vacuuming indexes, the leader process does some preparation work and asks
+ * all vacuum worker processes to run the same state. If the table has no
+ * indexes, each vacuum process just vacuums each page as it goes, so the
+ * dead tuple TIDs are not shared. The information required by parallel lazy
+ * vacuum, such as the vacuum statistics and the parallel heap scan
+ * description, is also shared among vacuum processes.
  *
  * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -38,23 +49,32 @@
 
 #include "access/genam.h"
 #include "access/heapam.h"
-#include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
+#include "commands/vacuum_internal.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
+#include "optimizer/pathnode.h"
+#include "optimizer/planmain.h"
+#include "optimizer/planner.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/ipc.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,70 +131,148 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
-typedef struct LVRelStats
+/* See note in lazy_scan_get_nextpage about forcing scanning of last page */
+#define FORCE_CHECK_PAGE(nblocks, blkno, vacrelstats) \
+	((blkno) == (nblocks) - 1 && should_attempt_truncation((vacrelstats)))
+
+/* Macros for checking the status of vacuum worker slot */
+#define IsVacuumWorkerStopped(pid) ((pid) == 0)
+#define IsVacuumWorkerInvalid(pid) (((pid) == InvalidPid) || ((pid) == 0))
+
+/*
+ * LVTidMap holds the dead tuple TIDs collected during the heap scan. The
+ * 'shared' field indicates whether the LVTidMap is shared among vacuum
+ * workers; when it is true, the map lives in shared memory.
+ */
+struct LVTidMap
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
-	/* Overall statistics about rel */
-	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
-	BlockNumber rel_pages;		/* total number of pages */
-	BlockNumber scanned_pages;	/* number of pages we examined */
-	BlockNumber pinskipped_pages;	/* # of pages we skipped due to a pin */
-	BlockNumber frozenskipped_pages;	/* # of frozen pages we skipped */
-	BlockNumber tupcount_pages; /* pages whose tuples we counted */
-	double		old_live_tuples;	/* previous value of pg_class.reltuples */
-	double		new_rel_tuples; /* new estimated total # of tuples */
-	double		new_live_tuples;	/* new estimated total # of live tuples */
-	double		new_dead_tuples;	/* new estimated total # of dead tuples */
-	BlockNumber pages_removed;
-	double		tuples_deleted;
-	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
+	int		max_items;	/* # slots allocated in itemptrs */
+	int		num_items;	/* current # of entries */
+
+	/* The fields used for vacuum heap */
+	int		item_idx;
+	int		vacuumed_pages;	/* # pages vacuumed in a heap vacuum cycle */
+
+	/* The fields used only for parallel lazy vacuum */
+	bool	shared;		/* dead tuple space is shared among vacuum workers */
+	slock_t	mutex;
+
 	/* List of TIDs of tuples we intend to delete */
 	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
-	int			num_index_scans;
-	TransactionId latestRemovedXid;
-	bool		lock_waiter_detected;
-} LVRelStats;
+	ItemPointerData	itemptrs[FLEXIBLE_ARRAY_MEMBER];
+};
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for index statistics that are used for parallel lazy vacuum.
+ * In single-process lazy vacuum, we update the index statistics right
+ * after cleaning up each index. However, since no updates are allowed
+ * during parallel mode, we store all index statistics in LVIndStats and
+ * update them after exiting parallel mode.
+ */
+typedef struct IndexStats
+{
+	bool		need_update;
+	BlockNumber	num_pages;
+	BlockNumber	num_tuples;
+} IndexStats;
+struct LVIndStats
+{
+	/*
+	 * nindexes is the length of the stats array. nprocessed and mutex are
+	 * used only in parallel lazy vacuum, when the individual indexes are
+	 * processed by the workers and the leader.
+	 */
+	int		nindexes;	/* total # of indexes */
+	int		nprocessed;	/* used for vacuum/cleanup index */
+	slock_t		mutex;	/* protect nprocessed */
+	IndexStats stats[FLEXIBLE_ARRAY_MEMBER];
+};
+#define SizeOfLVIndStats (offsetof(LVIndStats, stats) + sizeof(IndexStats))
+
+/* Scan description data for lazy vacuum */
+struct LVScanDescData
+{
+	/* Common information for scanning heap */
+	Relation	lv_rel;
+	bool		disable_page_skipping;	/* DISABLE_PAGE_SKIPPING option is given */
+	bool		aggressive;				/* aggressive vacuum */
+
+	/* Used for single lazy vacuum, otherwise NULL */
+	HeapScanDesc lv_heapscan;
 
+	/* Used for parallel lazy vacuum, otherwise invalid values */
+	BlockNumber	lv_cblock;
+	BlockNumber	lv_next_unskippable_block;
+	BlockNumber	lv_nblocks;
+};
+
+/*
+ * Status for leader in parallel lazy vacuum. LVLeader is only present
+ * in the leader process.
+ */
+typedef struct LVLeader
+{
+	/*
+	 * allrelstats points to a shared memory space that stores the relation
+	 * statistics (LVRelStats) of all vacuum workers.
+	 */
+	LVRelStats	*allrelstats;
+	ParallelContext	*pcxt;
+} LVLeader;
+
+/* Global variables for lazy vacuum */
+LVWorkerState	*WorkerState = NULL;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
-
+static BufferAccessStrategy vac_strategy;
 static TransactionId OldestXmin;
 static TransactionId FreezeLimit;
 static MultiXactId MultiXactCutoff;
 
-static BufferAccessStrategy vac_strategy;
-
-
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  LVRelStats *vacrelstats,
+				  LVTidMap *dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+							   IndexBulkDeleteResult *stats,
+							   LVRelStats *vacrelstats,
+							   IndexStats *indstats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno, Buffer buffer,
+							int tupindex, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples,
 					   ItemPointer itemptr);
-static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
+static bool lazy_tid_reaped(ItemPointer itemptr, void *dt);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static BlockNumber lazy_get_next_vacuum_page(LVState *lvstate, int *tupindex_p,
+											 int *npages_p);
+static bool lazy_dead_tuples_is_full(LVTidMap *tidmap);
+static int lazy_get_dead_tuple_count(LVTidMap *dead_tuples);
+static BlockNumber lazy_scan_get_nextpage(LVScanDesc lvscan, LVRelStats *vacrelstats,
+										  bool *all_visible_according_to_vm_p,
+										  Buffer *vmbuffer_p);
+static long lazy_get_max_dead_tuples(LVRelStats *vacrelstats, BlockNumber relblocks);
+
+/* function prototypes for parallel vacuum */
+static LVLeader *lazy_vacuum_begin_parallel(LVState *lvstate, int request);
+static void lazy_vacuum_end_parallel(LVState *lvstate, LVLeader *lvleader,
+									 bool update_stats);
+static void lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader,
+									int next_state);
+static void lazy_gather_worker_stats(LVLeader *lvleader, LVRelStats *vacrelstats);
+static void lazy_wait_for_vacuum_workers_to_be_done(void);
+static void lazy_set_workers_state(VacWorkerState new_state);
+static void lazy_wait_for_vacuum_workers_attach(ParallelContext *pcxt);
 
 
 /*
@@ -187,12 +285,11 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+lazy_vacuum_rel(Relation onerel, VacuumOption options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
-	Relation   *Irel;
-	int			nindexes;
 	PGRUsage	ru0;
 	TimestampTz starttime = 0;
 	long		secs;
@@ -218,7 +315,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,26 +343,34 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
-	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
+	/* Create lazy vacuum state and statistics */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->options = options;
+	lvstate->aggressive = aggressive;
+	lvstate->relid = RelationGetRelid(onerel);
+	lvstate->relation = onerel;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+	lvstate->lvshared = NULL;
 
+	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
 	vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
 	vacrelstats->old_live_tuples = onerel->rd_rel->reltuples;
-	vacrelstats->num_index_scans = 0;
-	vacrelstats->pages_removed = 0;
 	vacrelstats->lock_waiter_detected = false;
+	lvstate->vacrelstats = vacrelstats;
 
 	/* Open all indexes of the relation */
-	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vac_open_indexes(onerel, RowExclusiveLock, &lvstate->nindexes, &lvstate->indRels);
+	vacrelstats->hasindex = (lvstate->nindexes > 0);
 
-	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
-	vac_close_indexes(nindexes, Irel, NoLock);
+	vac_close_indexes(lvstate->nindexes, lvstate->indRels, NoLock);
 
 	/*
 	 * Compute whether we actually scanned the all unfrozen pages. If we did,
@@ -454,7 +559,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
 }
 
 /*
- *	lazy_scan_heap() -- scan an open heap relation
+ *	do_lazy_scan_heap() -- scan an open heap relation
  *
  *		This routine prunes each page in the heap, which will among other
  *		things truncate dead tuples to dead line pointers, defragment the
@@ -469,32 +574,19 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
-static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+int
+do_lazy_scan_heap(LVState *lvstate, bool *isFinished)
 {
-	BlockNumber nblocks,
-				blkno;
+	Relation 	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	BlockNumber	blkno;
 	HeapTupleData tuple;
 	char	   *relname;
 	TransactionId relfrozenxid = onerel->rd_rel->relfrozenxid;
 	TransactionId relminmxid = onerel->rd_rel->relminmxid;
-	BlockNumber empty_pages,
-				vacuumed_pages,
-				next_fsm_block_to_vacuum;
-	double		num_tuples,		/* total number of nonremovable tuples */
-				live_tuples,	/* live tuples (reltuples estimate) */
-				tups_vacuumed,	/* tuples cleaned up by vacuum */
-				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
 	int			i;
-	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
-	BlockNumber next_unskippable_block;
-	bool		skipping_blocks;
-	xl_heap_freeze_tuple *frozen;
-	StringInfoData buf;
+	bool		all_visible_according_to_vm;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -502,117 +594,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	};
 	int64		initprog_val[3];
 
-	pg_rusage_init(&ru0);
-
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
-		ereport(elevel,
-				(errmsg("aggressively vacuuming \"%s.%s\"",
-						get_namespace_name(RelationGetNamespace(onerel)),
-						relname)));
-	else
-		ereport(elevel,
-				(errmsg("vacuuming \"%s.%s\"",
-						get_namespace_name(RelationGetNamespace(onerel)),
-						relname)));
-
-	empty_pages = vacuumed_pages = 0;
-	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
-
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
-	nblocks = RelationGetNumberOfBlocks(onerel);
-	vacrelstats->rel_pages = nblocks;
-	vacrelstats->scanned_pages = 0;
-	vacrelstats->tupcount_pages = 0;
-	vacrelstats->nonempty_pages = 0;
-	vacrelstats->latestRemovedXid = InvalidTransactionId;
-
-	lazy_space_alloc(vacrelstats, nblocks);
-	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
-	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[1] = lvstate->lvscan->lv_nblocks;
+	initprog_val[2] = lvstate->dead_tuples->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/*
-	 * Except when aggressive is set, we want to skip pages that are
-	 * all-visible according to the visibility map, but only when we can skip
-	 * at least SKIP_PAGES_THRESHOLD consecutive pages.  Since we're reading
-	 * sequentially, the OS should be doing readahead for us, so there's no
-	 * gain in skipping a page now and then; that's likely to disable
-	 * readahead and so be counterproductive. Also, skipping even a single
-	 * page means that we can't update relfrozenxid, so we only want to do it
-	 * if we can skip a goodly number of pages.
-	 *
-	 * When aggressive is set, we can't skip pages just because they are
-	 * all-visible, but we can still skip pages that are all-frozen, since
-	 * such pages do not need freezing and do not affect the value that we can
-	 * safely set for relfrozenxid or relminmxid.
-	 *
-	 * Before entering the main loop, establish the invariant that
-	 * next_unskippable_block is the next block number >= blkno that we can't
-	 * skip based on the visibility map, either all-visible for a regular scan
-	 * or all-frozen for an aggressive scan.  We set it to nblocks if there's
-	 * no such block.  We also set up the skipping_blocks flag correctly at
-	 * this stage.
-	 *
-	 * Note: The value returned by visibilitymap_get_status could be slightly
-	 * out-of-date, since we make this test before reading the corresponding
-	 * heap page or locking the buffer.  This is OK.  If we mistakenly think
-	 * that the page is all-visible or all-frozen when in fact the flag's just
-	 * been cleared, we might fail to vacuum the page.  It's easy to see that
-	 * skipping a page when aggressive is not set is not a very big deal; we
-	 * might leave some dead tuples lying around, but the next vacuum will
-	 * find them.  But even when aggressive *is* set, it's still OK if we miss
-	 * a page whose all-frozen marking has just been cleared.  Any new XIDs
-	 * just added to that page are necessarily newer than the GlobalXmin we
-	 * computed, so they'll have no effect on the value to which we can safely
-	 * set relfrozenxid.  A similar argument applies for MXIDs and relminmxid.
-	 *
-	 * We will scan the table's last page, at least to the extent of
-	 * determining whether it has tuples or not, even if it should be skipped
-	 * according to the above rules; except when we've already determined that
-	 * it's not worth trying to truncate the table.  This avoids having
-	 * lazy_truncate_heap() take access-exclusive lock on the table to attempt
-	 * a truncation that just fails immediately because there are tuples in
-	 * the last page.  This is worth avoiding mainly because such a lock must
-	 * be replayed on any hot standby, where it can be disruptive.
-	 */
-	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
-	{
-		while (next_unskippable_block < nblocks)
-		{
-			uint8		vmstatus;
-
-			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
-												&vmbuffer);
-			if (aggressive)
-			{
-				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
-					break;
-			}
-			else
-			{
-				if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
-					break;
-			}
-			vacuum_delay_point();
-			next_unskippable_block++;
-		}
-	}
-
-	if (next_unskippable_block >= SKIP_PAGES_THRESHOLD)
-		skipping_blocks = true;
-	else
-		skipping_blocks = false;
-
-	for (blkno = 0; blkno < nblocks; blkno++)
+	while ((blkno = lazy_scan_get_nextpage(lvstate->lvscan, lvstate->vacrelstats,
+										   &all_visible_according_to_vm, &vmbuffer))
+		   != InvalidBlockNumber)
 	{
 		Buffer		buf;
 		Page		page;
@@ -629,159 +621,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		bool		has_dead_tuples;
 		TransactionId visibility_cutoff_xid = InvalidTransactionId;
 
-		/* see note above about forcing scanning of last page */
-#define FORCE_CHECK_PAGE() \
-		(blkno == nblocks - 1 && should_attempt_truncation(vacrelstats))
-
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-
-		if (blkno == next_unskippable_block)
-		{
-			/* Time to advance next_unskippable_block */
-			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
-			{
-				while (next_unskippable_block < nblocks)
-				{
-					uint8		vmskipflags;
-
-					vmskipflags = visibilitymap_get_status(onerel,
-														   next_unskippable_block,
-														   &vmbuffer);
-					if (aggressive)
-					{
-						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
-							break;
-					}
-					else
-					{
-						if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
-							break;
-					}
-					vacuum_delay_point();
-					next_unskippable_block++;
-				}
-			}
-
-			/*
-			 * We know we can't skip the current block.  But set up
-			 * skipping_blocks to do the right thing at the following blocks.
-			 */
-			if (next_unskippable_block - blkno > SKIP_PAGES_THRESHOLD)
-				skipping_blocks = true;
-			else
-				skipping_blocks = false;
-
-			/*
-			 * Normally, the fact that we can't skip this block must mean that
-			 * it's not all-visible.  But in an aggressive vacuum we know only
-			 * that it's not all-frozen, so it might still be all-visible.
-			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
-				all_visible_according_to_vm = true;
-		}
-		else
-		{
-			/*
-			 * The current block is potentially skippable; if we've seen a
-			 * long enough run of skippable blocks to justify skipping it, and
-			 * we're not forced to check it, then go ahead and skip.
-			 * Otherwise, the page must be at least all-visible if not
-			 * all-frozen, so we can set all_visible_according_to_vm = true.
-			 */
-			if (skipping_blocks && !FORCE_CHECK_PAGE())
-			{
-				/*
-				 * Tricky, tricky.  If this is in aggressive vacuum, the page
-				 * must have been all-frozen at the time we checked whether it
-				 * was skippable, but it might not be any more.  We must be
-				 * careful to count it as a skipped all-frozen page in that
-				 * case, or else we'll think we can't update relfrozenxid and
-				 * relminmxid.  If it's not an aggressive vacuum, we don't
-				 * know whether it was all-frozen, so we have to recheck; but
-				 * in this case an approximate answer is OK.
-				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
-					vacrelstats->frozenskipped_pages++;
-				continue;
-			}
-			all_visible_according_to_vm = true;
-		}
-
 		vacuum_delay_point();
 
 		/*
-		 * If we are close to overrunning the available space for dead-tuple
-		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
-		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
-		{
-			const int	hvp_index[] = {
-				PROGRESS_VACUUM_PHASE,
-				PROGRESS_VACUUM_NUM_INDEX_VACUUMS
-			};
-			int64		hvp_val[2];
-
-			/*
-			 * Before beginning index vacuuming, we release any pin we may
-			 * hold on the visibility map page.  This isn't necessary for
-			 * correctness, but we do it anyway to avoid holding the pin
-			 * across a lengthy, unrelated operation.
-			 */
-			if (BufferIsValid(vmbuffer))
-			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
-			}
-
-			/* Log cleanup info before we touch indexes */
-			vacuum_log_cleanup_info(onerel, vacrelstats);
-
-			/* Report that we are now vacuuming indexes */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
-
-			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
-
-			/*
-			 * Report that we are now vacuuming the heap.  We also increase
-			 * the number of index scans here; note that by using
-			 * pgstat_progress_update_multi_param we can update both
-			 * parameters atomically.
-			 */
-			hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
-			hvp_val[1] = vacrelstats->num_index_scans + 1;
-			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
-
-			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
-
-			/*
-			 * Forget the now-vacuumed tuples, and press on, but be careful
-			 * not to reset latestRemovedXid since we want that value to be
-			 * valid.
-			 */
-			vacrelstats->num_dead_tuples = 0;
-			vacrelstats->num_index_scans++;
-
-			/*
-			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
-			 */
-			FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
-			next_fsm_block_to_vacuum = blkno;
-
-			/* Report that we are once again scanning the heap */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
-		}
-
-		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.  However, it's
@@ -804,7 +646,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive &&
+				!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -837,7 +680,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -891,7 +734,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						(errmsg("relation \"%s\" page %u is uninitialized --- fixing",
 								relname, blkno)));
 				PageInit(page, BufferGetPageSize(buf), 0);
-				empty_pages++;
+				vacrelstats->empty_pages++;
 			}
 			freespace = PageGetHeapFreeSpace(page);
 			MarkBufferDirty(buf);
@@ -903,7 +746,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 		if (PageIsEmpty(page))
 		{
-			empty_pages++;
+			vacrelstats->empty_pages++;
 			freespace = PageGetHeapFreeSpace(page);
 
 			/* empty pages are always all-visible and all-frozen */
@@ -945,7 +788,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 *
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
-		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
+		vacrelstats->tuples_deleted += heap_page_prune(onerel, buf, OldestXmin, false,
 										 &vacrelstats->latestRemovedXid);
 
 		/*
@@ -956,7 +799,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = lazy_get_dead_tuple_count(lvstate->dead_tuples);
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -974,7 +817,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			/* Unused items require no processing, but we count 'em */
 			if (!ItemIdIsUsed(itemid))
 			{
-				nunused += 1;
+				vacrelstats->unused_tuples += 1;
 				continue;
 			}
 
@@ -989,13 +832,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			/*
 			 * DEAD item pointers are to be vacuumed normally; but we don't
-			 * count them in tups_vacuumed, else we'd be double-counting (at
+			 * count them in vacrelstats->tuples_deleted, else we'd be double-counting (at
 			 * least in the common case where heap_page_prune() just freed up
 			 * a non-HOT tuple).
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(lvstate->dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1047,7 +890,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 */
 					if (HeapTupleIsHotUpdated(&tuple) ||
 						HeapTupleIsHeapOnly(&tuple))
-						nkeep += 1;
+						vacrelstats->new_dead_tuples += 1;
 					else
 						tupgone = true; /* we can delete the tuple */
 					all_visible = false;
@@ -1063,7 +906,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * Count it as live.  Not only is this natural, but it's
 					 * also what acquire_sample_rows() does.
 					 */
-					live_tuples += 1;
+					vacrelstats->live_tuples += 1;
 
 					/*
 					 * Is the tuple definitely visible to all transactions?
@@ -1106,7 +949,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * If tuple is recently deleted then we must not remove it
 					 * from relation.
 					 */
-					nkeep += 1;
+					vacrelstats->new_dead_tuples += 1;
 					all_visible = false;
 					break;
 				case HEAPTUPLE_INSERT_IN_PROGRESS:
@@ -1132,7 +975,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * deleting transaction will commit and update the
 					 * counters after we report.
 					 */
-					live_tuples += 1;
+					vacrelstats->live_tuples += 1;
 					break;
 				default:
 					elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
@@ -1141,17 +984,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(lvstate->dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
+				vacrelstats->tuples_deleted += 1;
 				has_dead_tuples = true;
 			}
 			else
 			{
 				bool		tuple_totally_frozen;
 
-				num_tuples += 1;
+				vacrelstats->num_tuples += 1;
 				hastup = true;
 
 				/*
@@ -1161,9 +1004,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				if (heap_prepare_freeze_tuple(tuple.t_data,
 											  relfrozenxid, relminmxid,
 											  FreezeLimit, MultiXactCutoff,
-											  &frozen[nfrozen],
+											  &(lvstate->frozen[nfrozen]),
 											  &tuple_totally_frozen))
-					frozen[nfrozen++].offset = offnum;
+					lvstate->frozen[nfrozen++].offset = offnum;
 
 				if (!tuple_totally_frozen)
 					all_frozen = false;
@@ -1187,10 +1030,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				ItemId		itemid;
 				HeapTupleHeader htup;
 
-				itemid = PageGetItemId(page, frozen[i].offset);
+				itemid = PageGetItemId(page, lvstate->frozen[i].offset);
 				htup = (HeapTupleHeader) PageGetItem(page, itemid);
 
-				heap_execute_freeze_tuple(htup, &frozen[i]);
+				heap_execute_freeze_tuple(htup, &(lvstate->frozen[i]));
 			}
 
 			/* Now WAL-log freezing if necessary */
@@ -1199,7 +1042,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				XLogRecPtr	recptr;
 
 				recptr = log_heap_freeze(onerel, buf, FreezeLimit,
-										 frozen, nfrozen);
+										 lvstate->frozen, nfrozen);
 				PageSetLSN(page, recptr);
 			}
 
@@ -1210,20 +1053,24 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 &&
+			lazy_get_dead_tuple_count(lvstate->dead_tuples) > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer);
 			has_dead_tuples = false;
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
+			 *
+			 * If the table has no indexes, the dead tuple space lives in
+			 * local memory regardless of parallel or non-parallel lazy
+			 * vacuum, so we don't need to acquire the lock to modify it.
 			 */
-			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
+			lvstate->dead_tuples->num_items = 0;
+			vacrelstats->vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1231,11 +1078,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * the current block, we haven't yet updated its FSM entry (that
 			 * happens further down), so passing end == blkno is correct.
 			 */
-			if (blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
+			if (blkno - lvstate->next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
 			{
-				FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum,
+				FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
 										blkno);
-				next_fsm_block_to_vacuum = blkno;
+				lvstate->next_fsm_block_to_vacuum = blkno;
 			}
 		}
 
@@ -1338,127 +1185,31 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (lazy_get_dead_tuple_count(lvstate->dead_tuples) == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
-	}
-
-	/* report that everything is scanned and vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-
-	pfree(frozen);
 
-	/* save stats for use later */
-	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
-
-	/* now we can compute the new value for pg_class.reltuples */
-	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
-														  nblocks,
-														  vacrelstats->tupcount_pages,
-														  live_tuples);
-
-	/* also compute total number of surviving heap entries */
-	vacrelstats->new_rel_tuples =
-		vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
+		/* Dead tuple space is full, exit scanning */
+		if (lvstate->nindexes > 0 && lazy_dead_tuples_is_full(lvstate->dead_tuples))
+			break;
+	}
 
 	/*
-	 * Release any remaining pin on visibility map page.
+	 * Before beginning index vacuuming, we release any pin we may
+	 * hold on the visibility map page.  This isn't necessary for
+	 * correctness, but we do it anyway to avoid holding the pin
+	 * across a lengthy, unrelated operation.
 	 */
 	if (BufferIsValid(vmbuffer))
-	{
 		ReleaseBuffer(vmbuffer);
-		vmbuffer = InvalidBuffer;
-	}
-
-	/* If any tuples need to be deleted, perform final vacuum cycle */
-	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
-	{
-		const int	hvp_index[] = {
-			PROGRESS_VACUUM_PHASE,
-			PROGRESS_VACUUM_NUM_INDEX_VACUUMS
-		};
-		int64		hvp_val[2];
-
-		/* Log cleanup info before we touch indexes */
-		vacuum_log_cleanup_info(onerel, vacrelstats);
-
-		/* Report that we are now vacuuming indexes */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
-
-		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
-
-		/* Report that we are now vacuuming the heap */
-		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
-		hvp_val[1] = vacrelstats->num_index_scans + 1;
-		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
-
-		/* Remove tuples from heap */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
-		vacrelstats->num_index_scans++;
-	}
-
-	/*
-	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
-	 * not there were indexes.
-	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
-
-	/* report all blocks vacuumed; and that we're cleaning up */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
-
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
-	if (vacuumed_pages)
-		ereport(elevel,
-				(errmsg("\"%s\": removed %.0f row versions in %u pages",
-						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+	/* Reached the end of the table */
+	if (!BlockNumberIsValid(blkno))
+		*isFinished = true;
 
-	/*
-	 * This is pretty messy, but we split it up so that we can skip emitting
-	 * individual parts of the message when not applicable.
-	 */
-	initStringInfo(&buf);
-	appendStringInfo(&buf,
-					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
-					 nkeep, OldestXmin);
-	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
-					 nunused);
-	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
-									"Skipped %u pages due to buffer pins, ",
-									vacrelstats->pinskipped_pages),
-					 vacrelstats->pinskipped_pages);
-	appendStringInfo(&buf, ngettext("%u frozen page.\n",
-									"%u frozen pages.\n",
-									vacrelstats->frozenskipped_pages),
-					 vacrelstats->frozenskipped_pages);
-	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
-									"%u pages are entirely empty.\n",
-									empty_pages),
-					 empty_pages);
-	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
+	/* Remember the block we just scanned before leaving */
+	lvstate->current_block = blkno;
 
-	ereport(elevel,
-			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
-					RelationGetRelationName(onerel),
-					tups_vacuumed, num_tuples,
-					vacrelstats->scanned_pages, nblocks),
-			 errdetail_internal("%s", buf.data)));
-	pfree(buf.data);
+	return lazy_get_dead_tuple_count(lvstate->dead_tuples);
 }
 
 
@@ -1473,38 +1224,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * the tuples until we've removed their index entries, and we want to
  * process index entry removal in batches as large as possible.
  */
-static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+void
+lazy_vacuum_heap(Relation onerel, LVState *lvstate)
 {
-	int			tupindex;
+	int			tupindex = 0;
 	int			npages;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
+	BlockNumber tblk;
 
 	pg_rusage_init(&ru0);
 	npages = 0;
 
-	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while ((tblk = lazy_get_next_vacuum_page(lvstate, &tupindex, &npages))
+		   != InvalidBlockNumber)
 	{
-		BlockNumber tblk;
 		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
 		{
 			ReleaseBuffer(buf);
-			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+
+		lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex, &vmbuffer);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1512,7 +1261,6 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		UnlockReleaseBuffer(buf);
 		RecordPageWithFreeSpace(onerel, tblk, freespace);
-		npages++;
 	}
 
 	if (BufferIsValid(vmbuffer))
@@ -1521,11 +1269,13 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 		vmbuffer = InvalidBuffer;
 	}
 
-	ereport(elevel,
-			(errmsg("\"%s\": removed %d row versions in %d pages",
-					RelationGetRelationName(onerel),
-					tupindex, npages),
-			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+	/* Only the vacuum leader reports */
+	if (!IsParallelWorker())
+		ereport(elevel,
+				(errmsg("\"%s\": removed %d row versions in %d pages",
+						RelationGetRelationName(onerel),
+						tupindex, npages),
+				 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
@@ -1539,29 +1289,32 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer)
 {
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+	LVTidMap   *dead_tuples = lvstate->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
 	TransactionId visibility_cutoff_xid;
 	bool		all_frozen;
+	int			num_items = lazy_get_dead_tuple_count(dead_tuples);
 
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < num_items; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1691,7 +1444,8 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 static void
 lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+				  LVRelStats *vacrelstats,
+				  LVTidMap	*dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1708,12 +1462,13 @@ lazy_vacuum_index(Relation indrel,
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
 
+	/* Both the leader and the workers report this */
 	ereport(elevel,
 			(errmsg("scanned index \"%s\" to remove %d row versions",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					lazy_get_dead_tuple_count(dead_tuples)),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1723,7 +1478,8 @@ lazy_vacuum_index(Relation indrel,
 static void
 lazy_cleanup_index(Relation indrel,
 				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   LVRelStats *vacrelstats,
+				   IndexStats *indstats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1750,18 +1506,29 @@ lazy_cleanup_index(Relation indrel,
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
+	 * is accurate and we are not in parallel lazy vacuum.
 	 */
 	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	{
+		if (indstats)
+		{
+			/* In parallel lazy vacuum, remember the values and update them later */
+			indstats->need_update = true;
+			indstats->num_pages = stats->num_pages;
+			indstats->num_tuples = stats->num_index_tuples;
+		}
+		else
+			vac_update_relstats(indrel,
+								stats->num_pages,
+								stats->num_index_tuples,
+								0,
+								false,
+								InvalidTransactionId,
+								InvalidMultiXactId,
+								false);
+	}
 
+	/* This report is made by both the leader and the workers */
 	ereport(elevel,
 			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
 					RelationGetRelationName(indrel),
@@ -2084,57 +1851,51 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+void
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks)
 {
 	long		maxtuples;
-	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
-	autovacuum_work_mem != -1 ?
-	autovacuum_work_mem : maintenance_work_mem;
+	LVTidMap	*dead_tuples;
 
-	if (vacrelstats->hasindex)
-	{
-		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
-		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+	Assert(lvstate->dead_tuples == NULL);
 
-		/* curious coding here to ensure the multiplication can't overflow */
-		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
-			maxtuples = relblocks * LAZY_ALLOC_TUPLES;
+	maxtuples = lazy_get_max_dead_tuples(lvstate->vacrelstats,
+										 relblocks);
 
-		/* stay sane if small maintenance_work_mem */
-		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
-	}
-	else
-	{
-		maxtuples = MaxHeapTuplesPerPage;
-	}
+	dead_tuples = (LVTidMap *) palloc(SizeOfLVTidMap +
+									  sizeof(ItemPointerData) * (int) maxtuples);
+	dead_tuples->max_items = maxtuples;
+	dead_tuples->num_items = 0;
+	dead_tuples->shared = false;
+	dead_tuples->item_idx = 0;
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	lvstate->dead_tuples = dead_tuples;
 }
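+
+/*
+ * A reference sketch of the LVTidMap layout filled in above. The real
+ * definition lives in vacuum_internal.h; the fields below are inferred
+ * from their use in this file:
+ *
+ * typedef struct LVTidMap
+ * {
+ *     int        max_items;      /* maximum dead tuples we can store */
+ *     int        num_items;      /* current number of dead tuples */
+ *     bool       shared;         /* protect fields with mutex if true */
+ *     int        item_idx;       /* next TID to vacuum in the heap phase */
+ *     int        vacuumed_pages; /* pages vacuumed so far (shared case) */
+ *     slock_t    mutex;
+ *     ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];
+ * } LVTidMap;
+ */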
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
+	if (dead_tuples->shared)
+		SpinLockAcquire(&(dead_tuples->mutex));
+
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_items < dead_tuples->max_items)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_items] = *itemptr;
+		(dead_tuples->num_items)++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_items);
 	}
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&(dead_tuples->mutex));
 }
 
 /*
@@ -2145,14 +1906,14 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
  *		Assumes dead_tuples array is in sorted order.
  */
 static bool
-lazy_tid_reaped(ItemPointer itemptr, void *state)
+lazy_tid_reaped(ItemPointer itemptr, void *dt)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) dt;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								lazy_get_dead_tuple_count(dead_tuples),
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2061,1255 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Perform single or parallel lazy vacuum.
+ */
+static void
+lazy_scan_heap(LVState *lvstate)
+{
+	LVRelStats		*vacrelstats = lvstate->vacrelstats;
+	LVLeader		*lvleader = NULL;
+	Relation		onerel = lvstate->relation;
+	bool		isFinished = false;
+	int			nworkers = 0;
+	BlockNumber	nblocks;
+	char		*relname;
+	StringInfoData	buf;
+	PGRUsage		ru0;
+
+	/* Reset the parallel vacuum worker state */
+	WorkerState = NULL;
+
+	relname = RelationGetRelationName(onerel);
+
+	/* Plan the number of parallel workers */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		nworkers = plan_lazy_vacuum_workers(RelationGetRelid(lvstate->relation),
+											lvstate->options.nworkers);
+
+	if (nworkers > 0)
+	{
+		/* Set parallel context and attempt to launch parallel workers */
+		lvleader = lazy_vacuum_begin_parallel(lvstate, nworkers);
+	}
+	else
+	{
+		/* Prepare the dead tuple space for a single-process lazy heap scan */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+	}
+
+	pg_rusage_init(&ru0);
+
+	if (lvstate->aggressive)
+		ereport(elevel,
+				(errmsg("aggressively vacuuming \"%s.%s\"",
+						get_namespace_name(RelationGetNamespace(onerel)),
+						relname)));
+	else
+		ereport(elevel,
+				(errmsg("vacuuming \"%s.%s\"",
+						get_namespace_name(RelationGetNamespace(onerel)),
+						relname)));
+
+	nblocks = RelationGetNumberOfBlocks(onerel);
+	vacrelstats->rel_pages = nblocks;
+
+	vacrelstats->scanned_pages = 0;
+	vacrelstats->tupcount_pages = 0;
+	vacrelstats->nonempty_pages = 0;
+	vacrelstats->empty_pages = 0;
+	vacrelstats->latestRemovedXid = InvalidTransactionId;
+
+	lvstate->lvscan = lv_beginscan(onerel, lvstate->lvshared, lvstate->aggressive,
+						  (lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0);
+	lvstate->indbulkstats = (IndexBulkDeleteResult **)
+		palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+	lvstate->frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
+
+	/* Do the actual lazy vacuum */
+	while (!isFinished)
+	{
+		int ndeadtuples;
+		const int	hvp_index[] = {
+			PROGRESS_VACUUM_PHASE,
+			PROGRESS_VACUUM_NUM_INDEX_VACUUMS
+		};
+		int64		hvp_val[2];
+
+		/*
+		 * Scan the heap until the end of the table or, if the table has
+		 * indexes, until the dead tuple space is full.
+		 */
+		ndeadtuples = do_lazy_scan_heap(lvstate, &isFinished);
+
+		/* Reached the end of table with no garbage */
+		if (isFinished && ndeadtuples == 0)
+			break;
+
+		/* Log cleanup info before we touch indexes */
+		vacuum_log_cleanup_info(onerel, vacrelstats);
+
+		/* Prepare the index vacuum */
+		lazy_prepare_next_state(lvstate, lvleader, VACSTATE_VACUUM_INDEX);
+
+		/* Report that we are now vacuuming indexes */
+		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+
+		/* Remove index entries */
+		lazy_vacuum_all_indexes(lvstate);
+
+		/* Prepare the heap vacuum */
+		lazy_prepare_next_state(lvstate, lvleader, VACSTATE_VACUUM_HEAP);
+
+		/*
+		 * Report that we are now vacuuming the heap.  We also increase
+		 * the number of index scans here; note that by using
+		 * pgstat_progress_update_multi_param we can update both
+		 * parameters atomically.
+		 */
+		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
+		hvp_val[1] = vacrelstats->num_index_scans + 1;
+		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+
+		/* Remove tuples from heap */
+		lazy_vacuum_heap(onerel, lvstate);
+
+		/*
+		 * Vacuum the Free Space Map to make newly-freed space visible on
+		 * upper-level FSM pages.  Note we have not yet processed blkno.
+		 */
+		FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
+								lvstate->current_block);
+		lvstate->next_fsm_block_to_vacuum = lvstate->current_block;
+
+		vacrelstats->num_index_scans++;
+
+		if (!isFinished)
+		{
+			/*
+			 * Prepare for the next heap scan. Forget the now-vacuumed tuples,
+			 * and press on, but be careful not to reset latestRemovedXid since
+			 * we want that value to be valid.
+			 */
+			lazy_prepare_next_state(lvstate, lvleader, VACSTATE_SCAN);
+			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
+		}
+		else
+			pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, nblocks);
+	}
+
+	/* report that everything is scanned and vacuumed */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, nblocks);
+
+	/* End heap scan */
+	lv_endscan(lvstate->lvscan);
+
+	pfree(lvstate->frozen);
+
+	/*
+	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
+	 * not there were indexes.
+	 */
+	if (lvstate->current_block > lvstate->next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(onerel, lvstate->next_fsm_block_to_vacuum,
+								lvstate->current_block);
+
+	/* Prepare for the index cleanup */
+	lazy_prepare_next_state(lvstate, lvleader, VACSTATE_CLEANUP_INDEX);
+
+	/* report all blocks vacuumed; and that we're cleaning up */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/* Do post-vacuum cleanup and statistics update for each index */
+	lazy_cleanup_all_indexes(lvstate);
+
+	/* Shut down all vacuum workers */
+	lazy_prepare_next_state(lvstate, lvleader, VACSTATE_COMPLETED);
+
+	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	if (vacrelstats->vacuumed_pages)
+		ereport(elevel,
+				(errmsg("\"%s\": removed %.0f row versions in %u pages",
+						RelationGetRelationName(onerel),
+						vacrelstats->tuples_deleted, vacrelstats->vacuumed_pages)));
+
+	/* Finish parallel lazy vacuum and update index statistics */
+	if (nworkers > 0)
+		lazy_vacuum_end_parallel(lvstate, lvleader, true);
+
+	/*
+	 * This is pretty messy, but we split it up so that we can skip emitting
+	 * individual parts of the message when not applicable.
+	 */
+	initStringInfo(&buf);
+	appendStringInfo(&buf,
+					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
+					 vacrelstats->new_dead_tuples, OldestXmin);
+	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
+					 vacrelstats->unused_tuples);
+	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
+									"Skipped %u pages due to buffer pins, ",
+									vacrelstats->pinskipped_pages),
+					 vacrelstats->pinskipped_pages);
+	appendStringInfo(&buf, ngettext("%u frozen page.\n",
+									"%u frozen pages.\n",
+									vacrelstats->frozenskipped_pages),
+					 vacrelstats->frozenskipped_pages);
+	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
+									"%u pages are entirely empty.\n",
+									vacrelstats->empty_pages),
+					 vacrelstats->empty_pages);
+	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
+
+	ereport(elevel,
+			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
+					RelationGetRelationName(lvstate->relation),
+					vacrelstats->tuples_deleted, vacrelstats->num_tuples,
+					vacrelstats->scanned_pages, nblocks),
+			 errdetail_internal("%s", buf.data)));
+	pfree(buf.data);
+}
+
+/*
+ * Create the parallel context and launch workers for lazy vacuum.
+ * This function also constructs the leader's lvstate.
+ */
+static LVLeader *
+lazy_vacuum_begin_parallel(LVState *lvstate, int request)
+{
+	LVLeader		*lvleader = palloc(sizeof(LVLeader));
+	ParallelContext *pcxt;
+	Size			estshared,
+					estvacstats,
+					estindstats,
+					estdt,
+					estworker;
+	LVRelStats		*vacrelstats;
+	LVShared		*lvshared;
+	int			querylen;
+	int 		keys = 0;
+	char		*sharedquery;
+	long	 	maxtuples;
+	int			nparticipants = request + 1;
+	int i;
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
+								 request, true);
+	lvleader->pcxt = pcxt;
+
+	/* Calculate the maximum number of dead tuples we can store */
+	maxtuples = lazy_get_max_dead_tuples(lvstate->vacrelstats,
+										 RelationGetNumberOfBlocks(lvstate->relation));
+
+	/* Estimate size for shared state -- VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(sizeof(LVShared));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for per-worker vacuum statistics -- VACUUM_KEY_VACUUM_STATS */
+	estvacstats = MAXALIGN(mul_size(sizeof(LVRelStats), request));
+	shm_toc_estimate_chunk(&pcxt->estimator, estvacstats);
+	keys++;
+
+	/* Estimate size for parallel worker status including the leader -- VACUUM_KEY_WORKERS */
+	estworker = MAXALIGN(SizeOfLVWorkerState +
+						 mul_size(sizeof(VacuumWorker), request));
+	shm_toc_estimate_chunk(&pcxt->estimator, estworker);
+	keys++;
+
+	/* We need the dead tuple space in shared memory only when the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		/* Estimate size for index statistics -- VACUUM_KEY_INDEX_STATS */
+		estindstats = MAXALIGN(SizeOfLVIndStats +
+							   mul_size(sizeof(IndexStats), lvstate->nindexes));
+		shm_toc_estimate_chunk(&pcxt->estimator, estindstats);
+		keys++;
+
+		/* Estimate size for dead tuple control -- VACUUM_KEY_DEAD_TUPLES */
+		estdt = MAXALIGN(SizeOfLVTidMap +
+						 mul_size(sizeof(ItemPointerData), maxtuples));
+		shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+		keys++;
+	}
+	else
+	{
+		/* Dead tuples are stored in local memory if there are no indexes */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+		lvstate->indstats = NULL;
+	}
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate VACUUM_KEY_QUERY_TEXT space. Autovacuum doesn't
+	 * have debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+
+	InitializeParallelDSM(pcxt);
+
+	/*
+	 * Initialize dynamic shared memory for parallel lazy vacuum. We store
+	 * relevant information for the parallel heap scan, the dead tuple array,
+	 * per-worker vacuum statistics, and some parameters for lazy vacuum.
+	 */
+	lvshared = shm_toc_allocate(pcxt->toc, estshared);
+	lvshared->relid = lvstate->relid;
+	lvshared->aggressive = lvstate->aggressive;
+	lvshared->options = lvstate->options;
+	lvshared->oldestXmin = OldestXmin;
+	lvshared->freezeLimit = FreezeLimit;
+	lvshared->multiXactCutoff = MultiXactCutoff;
+	lvshared->elevel = elevel;
+	lvshared->is_wraparound = lvstate->is_wraparound;
+	lvshared->cost_delay = VacuumCostDelay;
+	lvshared->cost_limit = VacuumCostLimit;
+	lvshared->max_dead_tuples_per_worker = maxtuples / nparticipants;
+	heap_parallelscan_initialize(&lvshared->heapdesc, lvstate->relation, SnapshotAny);
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_SHARED, lvshared);
+	lvstate->lvshared = lvshared;
+
+	/* Prepare vacuum relation statistics */
+	vacrelstats = (LVRelStats *) shm_toc_allocate(pcxt->toc, estvacstats);
+	for (i = 0; i < request; i++)
+		memcpy(&vacrelstats[i], lvstate->vacrelstats, sizeof(LVRelStats));
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_VACUUM_STATS, vacrelstats);
+	lvleader->allrelstats = vacrelstats;
+
+	/* Prepare worker status */
+	WorkerState = (LVWorkerState *) shm_toc_allocate(pcxt->toc, estworker);
+	ConditionVariableInit(&WorkerState->cv);
+	LWLockInitialize(&WorkerState->vacuumlock, LWTRANCHE_PARALLEL_VACUUM);
+	WorkerState->nparticipantvacuum = request;
+	for (i = 0; i < request; i++)
+	{
+		VacuumWorker *worker = &(WorkerState->workers[i]);
+
+		worker->pid = InvalidPid;
+		worker->state = VACSTATE_INVALID;	/* initial state */
+		SpinLockInit(&worker->mutex);
+	}
+	shm_toc_insert(pcxt->toc, VACUUM_KEY_WORKERS, WorkerState);
+
+	/* Prepare index statistics and dead tuple space if the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		LVIndStats	*indstats;
+		LVTidMap	*dead_tuples;
+
+		/* Prepare Index statistics */
+		indstats = shm_toc_allocate(pcxt->toc, estindstats);
+		indstats->nindexes = lvstate->nindexes;
+		indstats->nprocessed = 0;
+		MemSet(indstats->stats, 0, sizeof(IndexStats) * indstats->nindexes);
+		SpinLockInit(&indstats->mutex);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_INDEX_STATS, indstats);
+		lvstate->indstats = indstats;
+
+		/* Prepare shared dead tuples space */
+		dead_tuples = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+		dead_tuples->max_items = maxtuples;
+		dead_tuples->num_items = 0;
+		dead_tuples->item_idx = 0;
+		dead_tuples->shared = true;
+		SpinLockInit(&dead_tuples->mutex);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+		lvstate->dead_tuples = dead_tuples;
+	}
+
+	/* Store query string for workers */
+	if (debug_query_string)
+	{
+		sharedquery = shm_toc_allocate(pcxt->toc, querylen + 1);
+		memcpy(sharedquery, debug_query_string, querylen + 1);
+		shm_toc_insert(pcxt->toc, VACUUM_KEY_QUERY_TEXT, sharedquery);
+	}
+
+	/* Set the leader PID to ourselves */
+	pgstat_report_leader_pid(MyProcPid);
+
+	/* Launch workers */
+	LaunchParallelWorkers(pcxt);
+
+	if (pcxt->nworkers_launched == 0)
+	{
+		lazy_vacuum_end_parallel(lvstate, lvleader, false);
+		pfree(lvleader);
+		return NULL;
+	}
+
+	/* Update the number of workers participating */
+	WorkerState->nparticipantvacuum_launched = pcxt->nworkers_launched;
+
+	lazy_wait_for_vacuum_workers_attach(pcxt);
+
+	return lvleader;
+}
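+
+/*
+ * For reference, the DSM area set up above holds one entry per shm_toc
+ * key (sizes are the MAXALIGNed estimates computed earlier):
+ *
+ *  VACUUM_KEY_SHARED        LVShared: parallel scan desc, xid limits,
+ *                           options, cost parameters
+ *  VACUUM_KEY_VACUUM_STATS  LVRelStats[request]: per-worker statistics
+ *  VACUUM_KEY_WORKERS       LVWorkerState: CV, lock, per-worker slots
+ *  VACUUM_KEY_INDEX_STATS   LVIndStats (only when the table has indexes)
+ *  VACUUM_KEY_DEAD_TUPLES   LVTidMap (only when the table has indexes)
+ *  VACUUM_KEY_QUERY_TEXT    query string (only when available)
+ */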
+
+/*
+ * Wait for all workers to finish, then exit parallel vacuum. If update_stats
+ * is true, gather the vacuum statistics of all parallel workers and
+ * update the index statistics.
+ */
+static void
+lazy_vacuum_end_parallel(LVState *lvstate, LVLeader *lvleader, bool update_stats)
+{
+	IndexStats *copied_indstats = NULL;
+
+	if (update_stats)
+	{
+		/* Copy index stats before destroying the parallel context */
+		copied_indstats = palloc(sizeof(IndexStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->indstats->stats,
+			   sizeof(IndexStats) * lvstate->nindexes);
+	}
+
+	/* Wait for workers to finish vacuuming */
+	WaitForParallelWorkersToFinish(lvleader->pcxt);
+
+	/* End parallel mode */
+	DestroyParallelContext(lvleader->pcxt);
+	ExitParallelMode();
+
+	/*
+	 * Since we cannot do any updates in parallel mode, we update the index
+	 * statistics after exiting parallel mode.
+	 */
+	if (update_stats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			Relation	ind = lvstate->indRels[i];
+			IndexStats *istat = (IndexStats *) &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (istat->need_update)
+				vac_update_relstats(ind,
+									istat->num_pages,
+									istat->num_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+	}
+
+	/* Reset shared fields */
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+	WorkerState = NULL;
+}
+
+/*
+ * lazy_gather_worker_stats() -- Gather vacuum statistics from workers
+ */
+static void
+lazy_gather_worker_stats(LVLeader *lvleader, LVRelStats *vacrelstats)
+{
+	int	i;
+
+	if (!IsInParallelMode())
+		return;
+
+	/* Gather worker stats */
+	for (i = 0; i < (WorkerState->nparticipantvacuum_launched); i++)
+	{
+		LVRelStats *wstats = (LVRelStats *) &lvleader->allrelstats[i];
+
+		vacrelstats->scanned_pages += wstats->scanned_pages;
+		vacrelstats->pinskipped_pages += wstats->pinskipped_pages;
+		vacrelstats->frozenskipped_pages += wstats->frozenskipped_pages;
+		vacrelstats->tupcount_pages += wstats->tupcount_pages;
+		vacrelstats->empty_pages += wstats->empty_pages;
+		vacrelstats->vacuumed_pages += wstats->vacuumed_pages;
+		vacrelstats->num_tuples += wstats->num_tuples;
+		vacrelstats->live_tuples += wstats->live_tuples;
+		vacrelstats->tuples_deleted += wstats->tuples_deleted;
+		vacrelstats->unused_tuples += wstats->unused_tuples;
+		vacrelstats->pages_removed += wstats->pages_removed;
+		vacrelstats->new_dead_tuples += wstats->new_dead_tuples;
+		vacrelstats->new_live_tuples += wstats->new_live_tuples;
+		vacrelstats->nonempty_pages += wstats->nonempty_pages;
+	}
+}
+
+/*
+ * Return the maximum number of dead tuples that can be stored, according
+ * to vac_work_mem.
+ */
+static long
+lazy_get_max_dead_tuples(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	long maxtuples;
+	int	vac_work_mem = IsAutoVacuumWorkerProcess() &&
+		autovacuum_work_mem != -1 ?
+		autovacuum_work_mem : maintenance_work_mem;
+
+	if (vacrelstats->hasindex)
+	{
+		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+		maxtuples = Min(maxtuples, INT_MAX);
+		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+
+		/* curious coding here to ensure the multiplication can't overflow */
+		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
+			maxtuples = relblocks * LAZY_ALLOC_TUPLES;
+
+		/* stay sane if small maintenance_work_mem */
+		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
+	}
+	else
+		maxtuples = MaxHeapTuplesPerPage;
+
+	return maxtuples;
+}
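+
+/*
+ * A worked example of the computation above: with maintenance_work_mem set
+ * to 64MB and a table with indexes, vac_work_mem is 65536 (kB), so
+ * maxtuples = 65536 * 1024 / sizeof(ItemPointerData) = 67108864 / 6, or
+ * about 11.2 million TIDs, further clamped by the table size (at most
+ * LAZY_ALLOC_TUPLES per heap block) and by MaxAllocSize.
+ */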
+
+/*
+ * lazy_prepare_next_state
+ *
+ * Do the preparation work before entering the next state. In parallel lazy
+ * vacuum, we must wait for all vacuum workers to finish the previous state
+ * before the preparation. Also, once prepared, we change the state of all
+ * vacuum workers and wake them up.
+ */
+static void
+lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader, int next_state)
+{
+	/* Wait for vacuum workers to finish the previous state */
+	if (IsInParallelMode())
+		lazy_wait_for_vacuum_workers_to_be_done();
+
+	switch (next_state)
+	{
+		/*
+		 * Do the preparation work before entering the next state. Since we
+		 * can guarantee that no vacuum worker touches or modifies the
+		 * parallel vacuum shared data during the preparation, we don't need
+		 * to take any lazy-vacuum-related locks to modify the shared data.
+		 */
+		case VACSTATE_SCAN:
+			{
+				LVTidMap *dead_tuples = lvstate->dead_tuples;
+
+				/* Before scanning heap, clear the dead tuples */
+				MemSet(dead_tuples->itemptrs, 0,
+					   sizeof(ItemPointerData) * dead_tuples->max_items);
+				dead_tuples->num_items = 0;
+				dead_tuples->item_idx = 0;
+				dead_tuples->vacuumed_pages = 0;
+				break;
+			}
+		case VACSTATE_VACUUM_INDEX:
+			{
+				LVTidMap	*dead_tuples = lvstate->dead_tuples;
+
+				/* Before vacuuming indexes, sort the dead tuple array */
+				qsort((void *) dead_tuples->itemptrs,
+					  dead_tuples->num_items,
+					  sizeof(ItemPointerData), vac_cmp_itemptr);
+
+				/* Reset the counter of processed indexes */
+				if (lvstate->indstats)
+					lvstate->indstats->nprocessed = 0;
+				break;
+			}
+		case VACSTATE_CLEANUP_INDEX:
+		{
+			LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+			/* Gather vacuum statistics of all vacuum workers */
+			lazy_gather_worker_stats(lvleader, lvstate->vacrelstats);
+
+			/* now we can compute the new value for pg_class.reltuples */
+			vacrelstats->new_live_tuples = vac_estimate_reltuples(lvstate->relation,
+																  vacrelstats->rel_pages,
+																  vacrelstats->tupcount_pages,
+																  vacrelstats->live_tuples);
+
+			/* also compute total number of surviving heap entries */
+			vacrelstats->new_rel_tuples =
+				vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
+
+			/* Reset the counter of processed indexes */
+			if (lvstate->indstats)
+				lvstate->indstats->nprocessed = 0;
+			break;
+		}
+		case VACSTATE_COMPLETED:
+		case VACSTATE_VACUUM_HEAP:
+			/* No preparation work is needed before vacuuming heap or exiting */
+			break;
+		case VACSTATE_INVALID:
+			elog(ERROR, "unexpected vacuum state %d", next_state);
+			break;
+		default:
+			elog(ERROR, "invalid vacuum state %d", next_state);
+	}
+
+	/* Advance state to the VACUUM state and wake up vacuum workers */
+	if (IsInParallelMode())
+	{
+		lazy_set_workers_state(next_state);
+		ConditionVariableBroadcast(&WorkerState->cv);
+	}
+}
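+
+/*
+ * For clarity, the state sequence the leader drives through the function
+ * above during one vacuum; the scan/vacuum cycle in the middle repeats
+ * each time the dead tuple space fills up:
+ *
+ *  SCAN -> VACUUM_INDEX -> VACUUM_HEAP -> (SCAN -> ...)
+ *       -> CLEANUP_INDEX -> COMPLETED
+ */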
+
+/*
+ * lazy_dead_tuples_is_full - is the dead tuple space full?
+ *
+ * Return true if the dead tuple space is full.
+ */
+static bool
+lazy_dead_tuples_is_full(LVTidMap *dead_tuples)
+{
+	bool isfull;
+
+	if (dead_tuples->shared)
+		SpinLockAcquire(&(dead_tuples->mutex));
+
+	isfull = ((dead_tuples->num_items > 0) &&
+			  ((dead_tuples->max_items - dead_tuples->num_items) < MaxHeapTuplesPerPage));
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&(dead_tuples->mutex));
+
+	return isfull;
+}
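+
+/*
+ * For example, with the default 8kB block size MaxHeapTuplesPerPage is 291,
+ * so with max_items = 1000 the space is treated as full once num_items
+ * exceeds 709; this guarantees that the dead tuples of one more heap page
+ * always fit.
+ */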
+
+/*
+ * lazy_get_dead_tuple_count
+ *
+ * Get the current number of dead tuples collected so far.
+ */
+static int
+lazy_get_dead_tuple_count(LVTidMap *dead_tuples)
+{
+	int num_items;
+
+	if (dead_tuples->shared)
+		SpinLockAcquire(&dead_tuples->mutex);
+
+	num_items = dead_tuples->num_items;
+
+	if (dead_tuples->shared)
+		SpinLockRelease(&dead_tuples->mutex);
+
+	return num_items;
+}
+
+/*
+ * lazy_get_next_vacuum_page
+ *
+ * For vacuuming heap pages, return the block number to vacuum next, taken
+ * from the dead tuple space. We also advance the dead tuple index past the
+ * current block, so that the next call finds the next different block.
+ *
+ * NB: the dead_tuples array must be sorted in TID order.
+ */
+static BlockNumber
+lazy_get_next_vacuum_page(LVState *lvstate, int *tupindex_p, int *npages_p)
+{
+	LVTidMap	*dead_tuples = lvstate->dead_tuples;
+	BlockNumber tblk;
+	BlockNumber	prev_tblk = InvalidBlockNumber;
+	BlockNumber	vacuum_tblk;
+
+	Assert(tupindex_p != NULL && npages_p != NULL);
+
+	if (!dead_tuples->shared)
+	{
+		/* Reached the end of dead tuples */
+		if (dead_tuples->item_idx >= dead_tuples->num_items)
+			return InvalidBlockNumber;
+
+		tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+		*tupindex_p = dead_tuples->item_idx++;
+		*npages_p = tblk;
+		return tblk;
+	}
+
+	/*
+	 * For parallel vacuum, we need locking.
+	 *
+	 * XXX: The maximum number of tuples we need to advance past is not
+	 * large, at most MaxHeapTuplesPerPage, so we use a spinlock here.
+	 */
+	if (dead_tuples->shared)
+		SpinLockAcquire(&(dead_tuples->mutex));
+
+	if (dead_tuples->item_idx >= dead_tuples->num_items)
+	{
+		/* Reached the end of dead tuples array */
+		vacuum_tblk = InvalidBlockNumber;
+		*tupindex_p = dead_tuples->num_items;
+		*npages_p = dead_tuples->vacuumed_pages;
+		goto done;
+	}
+
+	/* Set the block number being vacuumed next */
+	vacuum_tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+
+	/* Set the output arguments */
+	*tupindex_p = dead_tuples->item_idx;
+	*npages_p = ++(dead_tuples->vacuumed_pages);
+
+	/* Advance the index to the beginning of the next different block */
+	while (dead_tuples->item_idx < dead_tuples->num_items)
+	{
+		tblk = ItemPointerGetBlockNumber(&(dead_tuples->itemptrs[dead_tuples->item_idx]));
+
+		if (BlockNumberIsValid(prev_tblk) && prev_tblk != tblk)
+			break;
+
+		prev_tblk = tblk;
+		dead_tuples->item_idx++;
+	}
+
+done:
+	if (dead_tuples->shared)
+		SpinLockRelease(&(dead_tuples->mutex));
+
+	return vacuum_tblk;
+}
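+
+/*
+ * An example of the shared-case traversal above: if itemptrs holds the
+ * TIDs (10,1) (10,5) (10,8) (11,2), the first call returns block 10 with
+ * *tupindex_p = 0 and leaves item_idx pointing at (11,2), so the next
+ * call (possibly by another worker) returns block 11.
+ */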
+
+/*
+ * Vacuum all indexes. In parallel vacuum, each worker takes indexes one
+ * by one, and marks each index as done after vacuuming it. This marking is
+ * necessary to guarantee that all indexes are vacuumed based on the
+ * currently collected dead tuples. The leader process continues vacuuming
+ * even if an index could not be vacuumed completely because a parallel
+ * worker failed for whatever reason. The marks are checked before entering
+ * the next state.
+ */
+void
+lazy_vacuum_all_indexes(LVState *lvstate)
+{
+	int idx;
+	int nprocessed = 0;
+	LVIndStats *sharedstats = lvstate->indstats;
+
+	/* Take the index number we vacuum */
+	if (IsInParallelMode())
+	{
+		Assert(sharedstats != NULL);
+		SpinLockAcquire(&(sharedstats->mutex));
+		idx = (sharedstats->nprocessed)++;
+		SpinLockRelease(&sharedstats->mutex);
+	}
+	else
+		idx = nprocessed++;
+
+	while (idx < lvstate->nindexes)
+	{
+		/* Remove index entries */
+		lazy_vacuum_index(lvstate->indRels[idx], &lvstate->indbulkstats[idx],
+						  lvstate->vacrelstats, lvstate->dead_tuples);
+
+		/* Take the next index number we vacuum */
+		if (IsInParallelMode())
+		{
+			SpinLockAcquire(&(sharedstats->mutex));
+			idx = (sharedstats->nprocessed)++;
+			SpinLockRelease(&sharedstats->mutex);
+		}
+		else
+			idx = nprocessed++;
+	}
+}
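+
+/*
+ * For example, with three indexes and two workers plus the leader, the
+ * shared nprocessed counter hands out index numbers 0, 1 and 2 to
+ * whichever participants increment it first; a participant that draws a
+ * value >= nindexes simply falls out of the loop.
+ */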
+
+/*
+ * Clean up all indexes.
+ * This function is similar to lazy_vacuum_all_indexes.
+ */
+void
+lazy_cleanup_all_indexes(LVState *lvstate)
+{
+	int idx;
+	int nprocessed = 0;
+	LVIndStats *sharedstats = lvstate->indstats;
+
+	/* Return if no indexes */
+	if (lvstate->nindexes == 0)
+		return;
+
+	/* Get the target index number */
+	if (IsInParallelMode())
+	{
+		Assert(sharedstats != NULL);
+		SpinLockAcquire(&(sharedstats->mutex));
+		idx = (sharedstats->nprocessed)++;
+		SpinLockRelease(&sharedstats->mutex);
+	}
+	else
+		idx = nprocessed++;
+
+	while (idx < lvstate->nindexes)
+	{
+		/*
+		 * Do post-vacuum cleanup. Update statistics for each index if not
+		 * in parallel vacuum.
+		 */
+		lazy_cleanup_index(lvstate->indRels[idx],
+						   lvstate->indbulkstats[idx],
+						   lvstate->vacrelstats,
+						   (lvstate->indstats) ? &(sharedstats->stats[idx]) : NULL);
+
+		/* Get the next target index number */
+		if (IsInParallelMode())
+		{
+			SpinLockAcquire(&(sharedstats->mutex));
+			idx = (sharedstats->nprocessed)++;
+			SpinLockRelease(&sharedstats->mutex);
+		}
+		else
+			idx = nprocessed++;
+	}
+}
+
+/*
+ * Set xid limits. This function is for parallel vacuum workers.
+ */
+void
+vacuum_set_xid_limits_for_worker(TransactionId oldestxmin, TransactionId freezelimit,
+								  MultiXactId multixactcutoff)
+{
+	OldestXmin = oldestxmin;
+	FreezeLimit = freezelimit;
+	MultiXactCutoff = multixactcutoff;
+}
+
+/*
+ * Set error level during lazy vacuum for vacuum workers.
+ */
+void
+vacuum_set_elevel_for_worker(int worker_elevel)
+{
+	elevel = worker_elevel;
+}
+
+/*
+ * lazy_set_workers_state - set a new state for all parallel workers
+ */
+static void
+lazy_set_workers_state(VacWorkerState new_state)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
+	{
+		VacuumWorker *w = &WorkerState->workers[i];
+
+		SpinLockAcquire(&w->mutex);
+		if (!IsVacuumWorkerInvalid(w->pid))
+			w->state = new_state;
+		SpinLockRelease(&w->mutex);
+	}
+}
+
+/*
+ * Wait for parallel vacuum workers to attach to both the shmem context
+ * and a worker slot. This is needed to ensure that the leader can see the
+ * states of all launched workers when checking them.
+ */
+static void
+lazy_wait_for_vacuum_workers_attach(ParallelContext *pcxt)
+{
+	int i;
+
+	/* Wait for workers to attach to the shmem context */
+	WaitForParallelWorkersToAttach(pcxt);
+
+	/* Also, wait for workers to attach to the vacuum worker slot */
+	for (i = 0; i < pcxt->nworkers_launched; i++)
+	{
+		VacuumWorker	*worker = &WorkerState->workers[i];
+		int rc;
+
+		for (;;)
+		{
+			pid_t pid;
+
+			CHECK_FOR_INTERRUPTS();
+
+			/*
+			 * If the worker stopped without attaching to the vacuum worker
+			 * slot, throw an error.
+			 */
+			if (IsVacuumWorkerStopped(worker->pid))
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("parallel vacuum worker failed to initialize")));
+
+			SpinLockAcquire(&worker->mutex);
+			pid = worker->pid;
+			SpinLockRelease(&worker->mutex);
+
+			/* The worker successfully attached */
+			if (pid != InvalidPid)
+				break;
+
+			rc = WaitLatch(MyLatch,
+						   WL_TIMEOUT | WL_POSTMASTER_DEATH,
+						   10L, WAIT_EVENT_PARALLEL_VACUUM_STARTUP);
+
+			if (rc & WL_POSTMASTER_DEATH)
+				return;
+
+			ResetLatch(MyLatch);
+		}
+	}
+}
+
+/*
+ * lazy_wait_for_vacuum_workers_to_be_done - have all workers finished the
+ * previous work?
+ *
+ * Wait for all parallel workers to change their state to VACSTATE_WORKER_DONE.
+ */
+static void
+lazy_wait_for_vacuum_workers_to_be_done(void)
+{
+	while (true)
+	{
+		int i;
+		bool all_finished = true;
+
+		CHECK_FOR_INTERRUPTS();
+
+		for (i = 0; i < WorkerState->nparticipantvacuum_launched; i++)
+		{
+			VacuumWorker *w = &WorkerState->workers[i];
+			pid_t	pid;
+			int		state;
+
+			SpinLockAcquire(&w->mutex);
+			pid = w->pid;
+			state = w->state;
+			SpinLockRelease(&w->mutex);
+
+			/* Skip unused slot */
+			if (IsVacuumWorkerInvalid(pid))
+				continue;
+
+			if (state != VACSTATE_WORKER_DONE)
+			{
+				/* Not finished */
+				all_finished = false;
+				break;
+			}
+		}
+
+		/* All vacuum workers are done */
+		if (all_finished)
+			break;
+
+		ConditionVariableSleep(&WorkerState->cv, WAIT_EVENT_PARALLEL_VACUUM);
+	}
+
+	ConditionVariableCancelSleep();
+}
+
+/*
+ * lv_beginscan() -- begin lazy vacuum heap scan
+ *
+ * In parallel vacuum we use a parallel heap scan, so initialize the
+ * parallel heap scan descriptor.
+ */
+LVScanDesc
+lv_beginscan(Relation onerel, LVShared *lvshared, bool aggressive,
+			 bool disable_page_skipping)
+{
+	LVScanDesc	lvscan = (LVScanDesc) palloc(sizeof(LVScanDescData));
+
+	/* Scan target relation */
+	lvscan->lv_rel = onerel;
+	lvscan->lv_nblocks = RelationGetNumberOfBlocks(onerel);
+
+	/* Set scan options */
+	lvscan->aggressive = aggressive;
+	lvscan->disable_page_skipping = disable_page_skipping;
+
+	/* Initialize other fields */
+	lvscan->lv_heapscan = NULL;
+	lvscan->lv_cblock = 0;
+	lvscan->lv_next_unskippable_block = 0;
+
+	/* For parallel lazy vacuum */
+	if (lvshared)
+	{
+		Assert(!IsBootstrapProcessingMode());
+		lvscan->lv_heapscan = heap_beginscan_parallel(onerel, &lvshared->heapdesc);
+		heap_parallelscan_startblock_init(lvscan->lv_heapscan);
+	}
+
+	return lvscan;
+}
+
+/*
+ * lv_endscan() -- end lazy vacuum heap scan
+ */
+void
+lv_endscan(LVScanDesc lvscan)
+{
+	if (lvscan->lv_heapscan != NULL)
+		heap_endscan(lvscan->lv_heapscan);
+	pfree(lvscan);
+}
+
+/*
+ * Return the block number we need to scan next, or InvalidBlockNumber if
+ * scan finished.
+ *
+ * Except when aggressive is set, we want to skip pages that are
+ * all-visible according to the visibility map, but only when we can skip
+ * at least SKIP_PAGES_THRESHOLD consecutive pages.	 Since we're reading
+ * sequentially, the OS should be doing readahead for us, so there's no
+ * gain in skipping a page now and then; that's likely to disable
+ * readahead and so be counterproductive. Also, skipping even a single
+ * page means that we can't update relfrozenxid, so we only want to do it
+ * if we can skip a goodly number of pages.
+ *
+ * When aggressive is set, we can't skip pages just because they are
+ * all-visible, but we can still skip pages that are all-frozen, since
+ * such pages do not need freezing and do not affect the value that we can
+ * safely set for relfrozenxid or relminmxid.
+ *
+ * In the single-process lazy scan, before entering the main loop we
+ * establish the invariant that next_unskippable_block is the next block
+ * number >= blkno that we can't skip based on the visibility map, either
+ * all-visible for a regular scan or all-frozen for an aggressive scan.
+ * We set it to nblocks if there's no such block.  We also set up the
+ * skipping_blocks flag correctly at this stage.
+ *
+ * In parallel lazy scan, we scan heap pages using parallel heap scan.
+ * Each worker calls heap_parallelscan_nextpage() to exclusively obtain
+ * the block number it scans next. Unlike the single-process lazy scan,
+ * we skip all-visible blocks immediately.
+ *
+ * Note: The value returned by visibilitymap_get_status could be slightly
+ * out-of-date, since we make this test before reading the corresponding
+ * heap page or locking the buffer.	 This is OK.  If we mistakenly think
+ * that the page is all-visible or all-frozen when in fact the flag's just
+ * been cleared, we might fail to vacuum the page.	It's easy to see that
+ * skipping a page when aggressive is not set is not a very big deal; we
+ * might leave some dead tuples lying around, but the next vacuum will
+ * find them.  But even when aggressive *is* set, it's still OK if we miss
+ * a page whose all-frozen marking has just been cleared.  Any new XIDs
+ * just added to that page are necessarily newer than the GlobalXmin we
+ * Computed, so they'll have no effect on the value to which we can safely
+ * set relfrozenxid.  A similar argument applies for MXIDs and relminmxid.
+ *
+ * We will scan the table's last page, at least to the extent of
+ * determining whether it has tuples or not, even if it should be skipped
+ * according to the above rules; except when we've already determined that
+ * it's not worth trying to truncate the table.	 This avoids having
+ * lazy_truncate_heap() take access-exclusive lock on the table to attempt
+ * a truncation that just fails immediately because there are tuples in
+ * the last page.  This is worth avoiding mainly because such a lock must
+ * be replayed on any hot standby, where it can be disruptive.
+ */
+static BlockNumber
+lazy_scan_get_nextpage(LVScanDesc lvscan, LVRelStats *vacrelstats,
+					   bool *all_visible_according_to_vm_p, Buffer *vmbuffer_p)
+{
+	BlockNumber blkno;
+
+	/* Parallel lazy scan mode */
+	if (lvscan->lv_heapscan)
+	{
+		/*
+		 * In parallel lazy vacuum, since we cannot know how many consecutive
+		 * all-visible pages exist in the table, we skip scanning all-visible
+		 * pages immediately.
+		 */
+		while ((blkno = heap_parallelscan_nextpage(lvscan->lv_heapscan)) != InvalidBlockNumber)
+		{
+			*all_visible_according_to_vm_p = false;
+			vacuum_delay_point();
+
+			pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+
+			/* Consider skipping the page according to the visibility map */
+			if (!lvscan->disable_page_skipping &&
+				!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
+			{
+				uint8		vmstatus;
+
+				vmstatus = visibilitymap_get_status(lvscan->lv_rel, blkno, vmbuffer_p);
+
+				if (lvscan->aggressive)
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) != 0)
+					{
+						vacrelstats->frozenskipped_pages++;
+						continue;
+					}
+					else if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) != 0)
+						*all_visible_according_to_vm_p = true;
+				}
+				else
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) != 0)
+					{
+						if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) != 0)
+							vacrelstats->frozenskipped_pages++;
+						continue;
+					}
+				}
+			}
+
+			/* Okay, need to scan current blkno, break */
+			break;
+		}
+	}
+	else	/* Single lazy scan mode */
+	{
+		bool skipping_blocks = false;
+
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, lvscan->lv_cblock);
+
+		/* Initialize lv_next_unskippable_block if needed */
+		if (lvscan->lv_cblock == 0 && !lvscan->disable_page_skipping)
+		{
+			while (lvscan->lv_next_unskippable_block < lvscan->lv_nblocks)
+			{
+				uint8		vmstatus;
+
+				vmstatus = visibilitymap_get_status(lvscan->lv_rel,
+													lvscan->lv_next_unskippable_block,
+													vmbuffer_p);
+				if (lvscan->aggressive)
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
+						break;
+				}
+				else
+				{
+					if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
+						break;
+				}
+				vacuum_delay_point();
+				lvscan->lv_next_unskippable_block++;
+			}
+
+			if (lvscan->lv_next_unskippable_block >= SKIP_PAGES_THRESHOLD)
+				skipping_blocks = true;
+			else
+				skipping_blocks = false;
+		}
+
+		/* Decide the block number we need to scan */
+		for (blkno = lvscan->lv_cblock; blkno < lvscan->lv_nblocks; blkno++)
+		{
+			if (blkno == lvscan->lv_next_unskippable_block)
+			{
+				/* Time to advance next_unskippable_block */
+				lvscan->lv_next_unskippable_block++;
+				if (!lvscan->disable_page_skipping)
+				{
+					while (lvscan->lv_next_unskippable_block < lvscan->lv_nblocks)
+					{
+						uint8		vmstatus;
+
+						vmstatus = visibilitymap_get_status(lvscan->lv_rel,
+															lvscan->lv_next_unskippable_block,
+															vmbuffer_p);
+						if (lvscan->aggressive)
+						{
+							if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
+								break;
+						}
+						else
+						{
+							if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
+								break;
+						}
+						vacuum_delay_point();
+						lvscan->lv_next_unskippable_block++;
+					}
+				}
+
+				/*
+				 * We know we can't skip the current block.	 But set up
+				 * skipping_blocks to do the right thing at the
+				 * following blocks.
+				 */
+				if (lvscan->lv_next_unskippable_block - blkno > SKIP_PAGES_THRESHOLD)
+					skipping_blocks = true;
+				else
+					skipping_blocks = false;
+
+				/*
+				 * Normally, the fact that we can't skip this block must mean that
+				 * it's not all-visible.  But in an aggressive vacuum we know only
+				 * that it's not all-frozen, so it might still be all-visible.
+				 */
+				if (lvscan->aggressive && VM_ALL_VISIBLE(lvscan->lv_rel, blkno, vmbuffer_p))
+					*all_visible_according_to_vm_p = true;
+
+				/* Found the next unskippable block number */
+				break;
+			}
+			else
+			{
+				/*
+				 * The current block is potentially skippable; if we've seen a
+				 * long enough run of skippable blocks to justify skipping it, and
+				 * we're not forced to check it, then go ahead and skip.
+				 * Otherwise, the page must be at least all-visible if not
+				 * all-frozen, so we can set *all_visible_according_to_vm_p = true.
+				 */
+				if (skipping_blocks &&
+					!FORCE_CHECK_PAGE(vacrelstats->rel_pages, blkno, vacrelstats))
+				{
+					/*
+					 * Tricky, tricky.	If this is in aggressive vacuum, the page
+					 * must have been all-frozen at the time we checked whether it
+					 * was skippable, but it might not be any more.	 We must be
+					 * careful to count it as a skipped all-frozen page in that
+					 * case, or else we'll think we can't update relfrozenxid and
+					 * relminmxid.	If it's not an aggressive vacuum, we don't
+					 * know whether it was all-frozen, so we have to recheck; but
+					 * in this case an approximate answer is OK.
+					 */
+					if (lvscan->aggressive || VM_ALL_FROZEN(lvscan->lv_rel, blkno, vmbuffer_p))
+						vacrelstats->frozenskipped_pages++;
+					continue;
+				}
+
+				*all_visible_according_to_vm_p = true;
+
+				/* We need to scan current blkno, break */
+				break;
+			}
+		} /* for */
+
+		/* Advance the current block number for the next scan */
+		lvscan->lv_cblock = blkno + 1;
+	}
+
+	return (blkno == lvscan->lv_nblocks) ? InvalidBlockNumber : blkno;
+}
diff --git a/src/backend/commands/vacuumworker.c b/src/backend/commands/vacuumworker.c
new file mode 100644
index 0000000..ccdc7b1
--- /dev/null
+++ b/src/backend/commands/vacuumworker.c
@@ -0,0 +1,327 @@
+/*-------------------------------------------------------------------------
+ *
+ * vacuumworker.c
+ *	  Parallel lazy vacuum worker.
+ *
+ * A parallel vacuum worker is a process that assists lazy vacuum.
+ * It repeatedly waits for its state to be changed by the vacuum leader
+ * process. After finishing the work for a state, it sets its state to done.
+ * Normal termination is also requested by the leader process.
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/commands/vacuumworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/parallel.h"
+#include "access/xact.h"
+#include "commands/vacuum.h"
+#include "commands/vacuum_internal.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "tcop/tcopprot.h"
+
+static VacuumWorker	*MyVacuumWorker = NULL;
+
+/* Parallel vacuum worker function prototypes */
+static void lvworker_set_state(VacWorkerState new_state);
+static VacWorkerState lvworker_get_state(void);
+static void lvworker_mainloop(LVState *lvstate);
+static void lvworker_wait_for_next_work(void);
+static void lvworker_attach(void);
+static void lvworker_detach(void);
+static void lvworker_onexit(int code, Datum arg);
+
+/*
+ * Perform work within a launched parallel process.
+ */
+void
+lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	LVState		*lvstate = (LVState *) palloc(sizeof(LVState));
+	LVShared	*lvshared;
+	LVRelStats	*vacrelstats;
+	char		*sharedquery;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker")));
+
+	/* Register the callback function */
+	before_shmem_exit(lvworker_onexit, (Datum) 0);
+
+	/* Look up worker state and attach to the vacuum worker slot */
+	WorkerState = (LVWorkerState *) shm_toc_lookup(toc, VACUUM_KEY_WORKERS, false);
+	lvworker_attach();
+
+	/* Set shared state */
+	lvshared = (LVShared *) shm_toc_lookup(toc, VACUUM_KEY_SHARED, false);
+
+	/*
+	 * Set debug_query_string. debug_query_string may not be available in
+	 * the autovacuum case.
+	 */
+	sharedquery = shm_toc_lookup(toc, VACUUM_KEY_QUERY_TEXT, true);
+	if (sharedquery)
+	{
+		debug_query_string = sharedquery;
+		pgstat_report_activity(STATE_RUNNING, debug_query_string);
+	}
+	else
+		pgstat_report_activity(STATE_RUNNING, lvshared->is_wraparound ?
+							   "autovacuum: parallel worker (to prevent wraparound)" :
+							   "autovacuum: parallel worker");
+
+	/* Set individual vacuum statistics */
+	vacrelstats = (LVRelStats *) shm_toc_lookup(toc, VACUUM_KEY_VACUUM_STATS, false);
+
+	/* Set lazy vacuum state */
+	lvstate->relid = lvshared->relid;
+	lvstate->aggressive = lvshared->aggressive;
+	lvstate->options = lvshared->options;
+	lvstate->vacrelstats = vacrelstats + ParallelWorkerNumber;
+	lvstate->relation = relation_open(lvstate->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(lvstate->relation, RowExclusiveLock, &lvstate->nindexes,
+					 &lvstate->indRels);
+	lvstate->lvshared = lvshared;
+	lvstate->indstats = NULL;
+	lvstate->dead_tuples = NULL;
+
+	/*
+	 * Set the PROC_IN_VACUUM flag, which lets other concurrent VACUUMs know that
+	 * they can ignore this one while determining their OldestXmin. Also set the
+	 * PROC_VACUUM_FOR_WRAPAROUND flag. Please see the comment in vacuum_rel for
+	 * details.
+	 */
+	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+	MyPgXact->vacuumFlags |= PROC_IN_VACUUM;
+	if (lvshared->is_wraparound)
+		MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+	LWLockRelease(ProcArrayLock);
+
+	/* Set up space for both index statistics and dead tuples if the table has indexes */
+	if (lvstate->nindexes > 0)
+	{
+		LVTidMap		*dead_tuples;
+		LVIndStats		*indstats;
+
+		/* Attach shared dead tuples */
+		dead_tuples = (LVTidMap *) shm_toc_lookup(toc, VACUUM_KEY_DEAD_TUPLES, false);
+		lvstate->dead_tuples = dead_tuples;
+
+		/* Attach Shared index stats */
+		indstats = (LVIndStats *) shm_toc_lookup(toc, VACUUM_KEY_INDEX_STATS, false);
+		lvstate->indstats = indstats;
+
+		/* Prepare for index bulkdelete */
+		lvstate->indbulkstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+	}
+	else
+	{
+		/* Dead tuples are stored in local memory if there are no indexes */
+		lazy_space_alloc(lvstate, RelationGetNumberOfBlocks(lvstate->relation));
+		lvstate->indstats = NULL;
+	}
+
+	/* Restore vacuum xid limits and elevel */
+	vacuum_set_xid_limits_for_worker(lvshared->oldestXmin, lvshared->freezeLimit,
+									 lvshared->multiXactCutoff);
+	vacuum_set_elevel_for_worker(lvshared->elevel);
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
+								  lvshared->relid);
+
+	/* Restore vacuum delay */
+	VacuumCostDelay = lvshared->cost_delay;
+	VacuumCostLimit = lvshared->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Begin lazy heap scan */
+	lvstate->lvscan = lv_beginscan(lvstate->relation, lvstate->lvshared, lvshared->aggressive,
+						  (lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0);
+
+	/* Prepare other fields */
+	lvstate->frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
+
+	/* Enter the main loop */
+	lvworker_mainloop(lvstate);
+
+	/* The lazy vacuum is done; do the post-processing */
+	lv_endscan(lvstate->lvscan);
+	pgstat_progress_end_command();
+	lvworker_detach();
+	cancel_before_shmem_exit(lvworker_onexit, (Datum) 0);
+
+	vac_close_indexes(lvstate->nindexes, lvstate->indRels, RowExclusiveLock);
+	heap_close(lvstate->relation, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Main loop for vacuum workers.
+ */
+static void
+lvworker_mainloop(LVState *lvstate)
+{
+	bool	exit = false;
+
+	/*
+	 * Loop until the leader commands us to exit.
+	 */
+	while (!exit)
+	{
+		VacWorkerState mystate;
+
+		/* Wait for the status to be changed by the leader */
+		lvworker_wait_for_next_work();
+
+		/* Get my new state */
+		mystate = lvworker_get_state();
+
+		/* Dispatch the work according to the state */
+		switch (mystate)
+		{
+			case VACSTATE_SCAN:
+				{
+					bool dummy;
+					do_lazy_scan_heap(lvstate, &dummy);
+					break;
+				}
+			case VACSTATE_VACUUM_INDEX:
+				{
+					lazy_vacuum_all_indexes(lvstate);
+					break;
+				}
+			case VACSTATE_VACUUM_HEAP:
+				{
+					lazy_vacuum_heap(lvstate->relation, lvstate);
+					break;
+				}
+			case VACSTATE_CLEANUP_INDEX:
+				{
+					lazy_cleanup_all_indexes(lvstate);
+					break;
+				}
+			case VACSTATE_COMPLETED:
+				{
+					/* The leader asked us to exit */
+					exit = true;
+					break;
+				}
+			case VACSTATE_INVALID:
+			case VACSTATE_WORKER_DONE:
+				{
+					elog(ERROR, "unexpected vacuum state %d", mystate);
+					break;
+				}
+		}
+
+		/* Set my state to done after finishing the work */
+		lvworker_set_state(VACSTATE_WORKER_DONE);
+	}
+}
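+
+/*
+ * The leader/worker handshake around the loop above, for reference:
+ *
+ *  leader: prepare the next state, set every worker's state, broadcast CV
+ *  worker: wake up, run the work for the state, set VACSTATE_WORKER_DONE,
+ *          broadcast CV
+ *  leader: wake up, see all workers at VACSTATE_WORKER_DONE, prepare the
+ *          next state
+ */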
+
+/*
+ * Wait for my state to be changed by the vacuum leader.
+ */
+static void
+lvworker_wait_for_next_work(void)
+{
+	VacWorkerState mystate;
+
+	for (;;)
+	{
+		mystate = lvworker_get_state();
+
+		/* Got the next valid state from the vacuum leader */
+		if (mystate != VACSTATE_WORKER_DONE && mystate != VACSTATE_INVALID)
+			break;
+
+		/* Sleep until the next notification */
+		ConditionVariableSleep(&WorkerState->cv, WAIT_EVENT_PARALLEL_VACUUM);
+	}
+
+	ConditionVariableCancelSleep();
+}
+
+/*
+ * lvworker_get_state - get my current state
+ */
+static VacWorkerState
+lvworker_get_state(void)
+{
+	VacWorkerState state;
+
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	state = MyVacuumWorker->state;
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	return state;
+}
+
+/*
+ * lvworker_set_state - set my state to the new state
+ */
+static void
+lvworker_set_state(VacWorkerState new_state)
+{
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	MyVacuumWorker->state = new_state;
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	ConditionVariableBroadcast(&WorkerState->cv);
+}
+
+/*
+ * Cleanup function for a parallel vacuum worker.
+ */
+static void
+lvworker_onexit(int code, Datum arg)
+{
+	if (IsInParallelMode() && MyVacuumWorker)
+		lvworker_detach();
+}
+
+/*
+ * Detach the worker and clean up the worker information.
+ */
+static void
+lvworker_detach(void)
+{
+	SpinLockAcquire(&MyVacuumWorker->mutex);
+	MyVacuumWorker->state = VACSTATE_INVALID;
+	MyVacuumWorker->pid = 0;	/* the worker is dead */
+	SpinLockRelease(&MyVacuumWorker->mutex);
+
+	MyVacuumWorker = NULL;
+}
+
+/*
+ * Attach to a worker slot according to its ParallelWorkerNumber.
+ */
+static void
+lvworker_attach(void)
+{
+	VacuumWorker *vworker;
+
+	LWLockAcquire(&WorkerState->vacuumlock, LW_EXCLUSIVE);
+	vworker = &WorkerState->workers[ParallelWorkerNumber];
+	vworker->pid = MyProcPid;
+	vworker->state = VACSTATE_SCAN; /* first state */
+	LWLockRelease(&WorkerState->vacuumlock);
+
+	MyVacuumWorker = vworker;
+}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3bb91c9..e0e9d6d 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1667,7 +1667,12 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
+	if (a->options.flags != b->options.flags)
+		return false;
+
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
+
 	COMPARE_NODE_FIELD(rels);
 
 	return true;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c729a99..c33af66 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -52,6 +52,7 @@
 #include "parser/parsetree.h"
 #include "parser/parse_agg.h"
 #include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
 #include "storage/dsm_impl.h"
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
@@ -6060,6 +6061,138 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 }
 
 /*
+ * plan_lazy_vacuum_workers
+ *		Use the planner to decide how many parallel worker processes
+ *		VACUUM and autovacuum should request for use
+ *
+ * tableOid is the table being vacuumed; it must not be a non-table or a
+ * special system table.
+ * nworkers_requested is the number of workers requested by the user with
+ * the PARALLEL option of VACUUM; it's 0 if not requested.
+ *
+ * Return value is the number of parallel worker processes to request.  It
+ * may be unsafe to proceed if this is 0.  Note that this does not include the
+ * leader participating as a worker (value is always a number of parallel
+ * worker processes).
+ *
+ * Note: caller had better already hold some type of lock on the table and
+ * index.
+ */
+int
+plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested)
+{
+	int				parallel_workers;
+	PlannerInfo 	*root;
+	Query	   		*query;
+	PlannerGlobal 	*glob;
+	RangeTblEntry 	*rte;
+	RelOptInfo 		*rel;
+	Relation		heap;
+	BlockNumber		nblocks;
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/* Set up largely-dummy planner state */
+	query = makeNode(Query);
+	query->commandType = CMD_SELECT;
+
+	glob = makeNode(PlannerGlobal);
+
+	root = makeNode(PlannerInfo);
+	root->parse = query;
+	root->glob = glob;
+	root->query_level = 1;
+	root->planner_cxt = CurrentMemoryContext;
+	root->wt_param_id = -1;
+
+	/*
+	 * Build a minimal RTE.
+	 *
+	 * Set the target's table to be an inheritance parent.  This is a kludge
+	 * that prevents problems within get_relation_info(), which does not
+	 * expect that any IndexOptInfo is currently undergoing REINDEX.
+	 */
+	rte = makeNode(RangeTblEntry);
+	rte->rtekind = RTE_RELATION;
+	rte->relid = tableOid;
+	rte->relkind = RELKIND_RELATION;	/* Don't be too picky. */
+	rte->lateral = false;
+	rte->inh = true;
+	rte->inFromCl = true;
+	query->rtable = list_make1(rte);
+
+	/* Set up RTE/RelOptInfo arrays */
+	setup_simple_rel_arrays(root);
+
+	/* Build RelOptInfo */
+	rel = build_simple_rel(root, 1, NULL);
+
+	heap = heap_open(tableOid, NoLock);
+	nblocks = RelationGetNumberOfBlocks(heap);
+
+	/*
+	 * If a number of workers is requested, accept it (though still cap it
+	 * at max_parallel_maintenance_workers).
+	 */
+	if (nworkers_requested > 0)
+	{
+		parallel_workers = Min(nworkers_requested,
+							   max_parallel_maintenance_workers);
+
+		if (parallel_workers != nworkers_requested)
+			ereport(NOTICE,
+					(errmsg("%d parallel vacuum workers requested, but capped by max_parallel_maintenance_workers",
+							nworkers_requested),
+					 errhint("Increase max_parallel_maintenance_workers.")));
+
+		goto done;
+	}
+
+	/*
+	 * If the parallel_workers storage parameter is set for the table, accept
+	 * that as the number of parallel worker processes to launch (though still
+	 * cap at max_parallel_maintenance_workers). Note that we deliberately do
+	 * not consider any other factor when parallel_workers is set (e.g., memory
+	 * use by workers).
+	 */
+	if (rel->rel_parallel_workers != -1)
+	{
+		parallel_workers = Min(rel->rel_parallel_workers,
+							   max_parallel_maintenance_workers);
+		goto done;
+	}
+
+	/*
+	 * Determine the number of workers to scan the heap relation, using the
+	 * generic model.
+	 */
+	parallel_workers = compute_parallel_worker(rel,
+											   nblocks,
+											   -1,
+											   max_parallel_maintenance_workers);
+	/*
+	 * Cap workers based on available maintenance_work_mem as needed.
+	 *
+	 * Note that each tuplesort participant receives an even share of the
+	 * total maintenance_work_mem budget.  Aim to leave participants
+	 * (including the leader as a participant) with no less than 32MB of
+	 * memory.  This leaves cases where maintenance_work_mem is set to 64MB
+	 * immediately past the threshold of being capable of launching a single
+	 * parallel worker to sort.
+	 */
+	while (parallel_workers > 0 &&
+		   maintenance_work_mem / (parallel_workers + 1) < 32768L)
+		parallel_workers--;
+
+done:
+	heap_close(heap, NoLock);
+
+	return parallel_workers;
+}
+
+/*
  * plan_create_index_workers
  *		Use the planner to decide how many parallel worker processes
  *		CREATE INDEX should request for use
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6d23bfb..47fff29 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOption *makeVacOpt(VacuumOptionFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOption		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10503,22 +10505,29 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					VacuumOption *vacopt = makeVacOpt(VACOPT_VACUUM, 0);
 					if ($2)
-						n->options |= VACOPT_FULL;
+						vacopt->flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						vacopt->flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						vacopt->flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						vacopt->flags |= VACOPT_ANALYZE;
+
+					n->options.flags = vacopt->flags;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
+					pfree(vacopt);
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
-					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					VacuumStmt 		*n = makeNode(VacuumStmt);
+					VacuumOption 	*vacopt = $3;
+
+					n->options.flags = vacopt->flags | VACOPT_VACUUM;
+					n->options.nworkers = vacopt->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10526,20 +10535,44 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOption *vacopt1 = (VacuumOption *) $1;
+					VacuumOption *vacopt2 = (VacuumOption *) $3;
+
+					/* OR flags */
+					vacopt1->flags |= vacopt2->flags;
+
+					/* Set requested parallel worker number */
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+
+					$$ = vacopt1;
+					pfree(vacopt2);
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+				{
+					if ($2 < 1)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("parallel vacuum degree must be more than 1"),
+								 parser_errposition(@1)));
+					$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+				}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10551,16 +10584,23 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					VacuumOption *vacopt = makeVacOpt(VACOPT_ANALYZE, 0);
+
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						vacopt->flags |= VACOPT_VERBOSE;
+
+					n->options.flags = vacopt->flags;
+					n->options.nworkers = 0;
 					n->rels = $3;
 					$$ = (Node *)n;
+					pfree(vacopt);
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
-					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					VacuumStmt 		*n = makeNode(VacuumStmt);
+
+					n->options.flags = $3 | VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16319,6 +16359,16 @@ makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
 	return (Node *) s;
 }
 
+static VacuumOption *
+makeVacOpt(VacuumOptionFlag flag, int nworkers)
+{
+	VacuumOption *vacopt = palloc(sizeof(VacuumOption));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
+
 /* parser_init()
  * Initialize to parse one query string
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9780895..c117d7c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,16 +187,16 @@ typedef struct av_relation
 /* struct to keep track of tables to vacuum and/or analyze, after rechecking */
 typedef struct autovac_table
 {
-	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
-	int			at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
-	bool		at_dobalance;
-	bool		at_sharedrel;
-	char	   *at_relname;
-	char	   *at_nspname;
-	char	   *at_datname;
+	Oid				at_relid;
+	VacuumOption	at_vacoptions;	/* bitmask of VacuumOption */
+	VacuumParams 	at_params;
+	int				at_vacuum_cost_delay;
+	int				at_vacuum_cost_limit;
+	bool			at_dobalance;
+	bool			at_sharedrel;
+	char	   		*at_relname;
+	char	   		*at_nspname;
+	char	   		*at_datname;
 } autovac_table;
 
 /*-------------
@@ -2490,7 +2490,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2842,6 +2842,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			vac_cost_limit;
 		int			vac_cost_delay;
 		int			log_min_duration;
+		int			parallel_workers;
 
 		/*
 		 * Calculate the vacuum cost parameters and the freeze ages.  If there
@@ -2888,13 +2889,20 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->multixact_freeze_table_age
 			: default_multixact_freeze_table_age;
 
+		parallel_workers = (avopts &&
+							avopts->vacuum_parallel_workers >= 0)
+			? avopts->vacuum_parallel_workers
+			: 0;
+
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
-			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+			(!wraparound ? VACOPT_SKIP_LOCKED : 0) |
+			(dovacuum ? VACOPT_PARALLEL : 0);	/* always consider parallel */
+		tab->at_vacoptions.nworkers = parallel_workers;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3140,10 +3148,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 42bccce..9ffdecb 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2487,7 +2487,6 @@ pgstat_fetch_stat_funcentry(Oid func_id)
 	return funcentry;
 }
 
-
 /* ----------
  * pgstat_fetch_stat_beentry() -
  *
@@ -3054,6 +3053,25 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 }
 
 /*-----------
+ * pgstat_report_leader_pid() -
+ *
+ * Report process id of the leader process that this backend is involved
+ * with.
+ */
+void
+pgstat_report_leader_pid(int pid)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+
+	if (!beentry)
+		return;
+
+	pgstat_increment_changecount_before(beentry);
+	beentry->st_leader_pid = pid;
+	pgstat_increment_changecount_after(beentry);
+}
+
+/*-----------
  * pgstat_progress_start_command() -
  *
  * Set st_progress_command (and st_progress_command_target) in own backend
@@ -3659,6 +3677,12 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 			break;
 		case WAIT_EVENT_PARALLEL_FINISH:
 			event_name = "ParallelFinish";
+			break;
+		case WAIT_EVENT_PARALLEL_VACUUM_STARTUP:
+			event_name = "ParallelVacuumStartup";
+			break;
+		case WAIT_EVENT_PARALLEL_VACUUM:
+			event_name = "ParallelVacuum";
 			break;
 		case WAIT_EVENT_PROCARRAY_GROUP_UPDATE:
 			event_name = "ProcArrayGroupUpdate";
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 970c94e..23dc6d3 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e95e347..67aaabf 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -439,7 +439,7 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_PROGRESS_COLS	PGSTAT_NUM_PROGRESS_PARAM + 3
+#define PG_STAT_GET_PROGRESS_COLS	PGSTAT_NUM_PROGRESS_PARAM + 4
 	int			num_backends = pgstat_fetch_stat_numbackends();
 	int			curr_backend;
 	char	   *cmd = text_to_cstring(PG_GETARG_TEXT_PP(0));
@@ -516,14 +516,16 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		if (has_privs_of_role(GetUserId(), beentry->st_userid))
 		{
 			values[2] = ObjectIdGetDatum(beentry->st_progress_command_target);
+			values[3] = Int32GetDatum(beentry->st_leader_pid);
 			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
-				values[i + 3] = Int64GetDatum(beentry->st_progress_param[i]);
+				values[i + 4] = Int64GetDatum(beentry->st_progress_param[i]);
 		}
 		else
 		{
 			nulls[2] = true;
+			nulls[3] = true;
 			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
-				nulls[i + 3] = true;
+				nulls[i + 4] = true;
 		}
 
 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4026018..c30d791 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5005,9 +5005,9 @@
   proname => 'pg_stat_get_progress_info', prorows => '100', proretset => 't',
   provolatile => 's', proparallel => 'r', prorettype => 'record',
   proargtypes => 'text',
-  proallargtypes => '{text,int4,oid,oid,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8}',
-  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}',
+  proallargtypes => '{text,int4,oid,oid,int4,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{cmdtype,pid,datid,relid,leader_pid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}',
   prosrc => 'pg_stat_get_progress_info' },
 { oid => '3099',
   descr => 'statistics: information about currently active replication',
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 2f4303e..ea71d60 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -14,16 +14,19 @@
 #ifndef VACUUM_H
 #define VACUUM_H
 
+#include "access/heapam_xlog.h"
 #include "access/htup.h"
 #include "catalog/pg_class.h"
+#include "access/parallel.h"
+#include "access/relscan.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
 #include "nodes/parsenodes.h"
 #include "storage/buf.h"
+#include "storage/condition_variable.h"
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
-
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
  * to be analyzed.  The struct and subsidiary data are in anl_context,
@@ -155,10 +158,9 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
-
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -192,8 +194,9 @@ extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
 					 VacuumParams *params, int options, LOCKMODE lmode);
 
 /* in commands/vacuumlazy.c */
-extern void lazy_vacuum_rel(Relation onerel, int options,
+extern void lazy_vacuum_rel(Relation onerel, VacuumOption options,
 				VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/commands/vacuum_internal.h b/src/include/commands/vacuum_internal.h
new file mode 100644
index 0000000..8a132f9
--- /dev/null
+++ b/src/include/commands/vacuum_internal.h
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * vacuum_internal.h
+ *	  Internal declarations for lazy vacuum
+ *
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/commands/vacuum_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef VACUUM_INTERNAL_H
+#define VACUUM_INTERNAL_H
+
+/* DSM key for parallel lazy vacuum */
+#define VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define VACUUM_KEY_VACUUM_STATS		UINT64CONST(0xFFFFFFFFFFF00002)
+#define VACUUM_KEY_INDEX_STATS	    UINT64CONST(0xFFFFFFFFFFF00003)
+#define VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00004)
+#define VACUUM_KEY_WORKERS			UINT64CONST(0xFFFFFFFFFFF00005)
+#define VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00006)
+
+/*
+ * Type definitions of lazy vacuum. The fields of these structs are
+ * accessed by only the vacuum leader.
+ */
+typedef struct LVScanDescData LVScanDescData;
+typedef struct LVScanDescData *LVScanDesc;
+typedef struct LVTidMap	LVTidMap;
+typedef struct LVIndStats LVIndStats;
+
+/* Vacuum worker state for parallel lazy vacuum */
+typedef enum VacWorkerState
+{
+	VACSTATE_INVALID = 0,
+	VACSTATE_SCAN,
+	VACSTATE_VACUUM_INDEX,
+	VACSTATE_VACUUM_HEAP,
+	VACSTATE_CLEANUP_INDEX,
+	VACSTATE_WORKER_DONE,
+	VACSTATE_COMPLETED
+} VacWorkerState;
+
+/*
+ * The 'pid' field always starts out as InvalidPid, which means the vacuum
+ * worker is starting up; it is set by the vacuum worker itself during
+ * startup. When the vacuum worker exits or detaches from its worker slot,
+ * 'pid' is set to 0, which means the vacuum worker is dead.
+ */
+typedef struct VacuumWorker
+{
+	pid_t			pid;	/* parallel worker's pid.
+							   InvalidPid = not started yet; 0 = dead */
+
+	VacWorkerState	state;	/* current worker's state */
+	slock_t			mutex;	/* protect the above fields */
+} VacuumWorker;
+
+/* Struct to control parallel vacuum workers */
+typedef struct LVWorkerState
+{
+	int		nparticipantvacuum;		/* # of parallel workers only, not
+									   including the leader */
+	int		nparticipantvacuum_launched;	/* # of those workers actually
+											   launched */
+
+	/* condition variable signaled when changing status */
+	ConditionVariable	cv;
+
+	/* protect workers array */
+	LWLock				vacuumlock;
+
+	VacuumWorker workers[FLEXIBLE_ARRAY_MEMBER];
+} LVWorkerState;
+#define SizeOfLVWorkerState (offsetof(LVWorkerState, workers) + sizeof(VacuumWorker))
+
+typedef struct LVRelStats
+{
+	/* hasindex = true means two-pass strategy; false means one-pass */
+	bool		hasindex;
+	/* Overall statistics about rel */
+	BlockNumber old_rel_pages;		/* previous value of pg_class.relpages */
+	BlockNumber rel_pages;			/* total number of pages */
+	BlockNumber scanned_pages;		/* number of pages we examined */
+	BlockNumber pinskipped_pages;	/* # of pages we skipped due to a pin */
+	BlockNumber frozenskipped_pages;	/* # of frozen pages we skipped */
+	BlockNumber tupcount_pages;		/* pages whose tuples we counted */
+	BlockNumber empty_pages;		/* # of empty pages */
+	BlockNumber vacuumed_pages;		/* # of pages we vacuumed */
+	double		num_tuples;			/* total number of nonremoval tuples */
+	double		live_tuples;		/* live tuples (reltuples estimate) */
+	double		tuples_deleted;		/* tuples cleaned up by vacuum */
+	double		unused_tuples;		/* unused item pointers */
+	double		old_live_tuples;	/* previous value of pg_class.reltuples */
+	double		new_rel_tuples;		/* new estimated total # of tuples */
+	double		new_live_tuples;	/* new estimated total # of live tuples */
+	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	BlockNumber pages_removed;
+	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
+	int			num_index_scans;
+	TransactionId latestRemovedXid;
+	bool		lock_waiter_detected;
+} LVRelStats;
+
+/*
+ * Shared information among parallel workers.
+ */
+typedef struct LVShared
+{
+	/* Target relation's OID */
+	Oid		relid;
+
+	/* Options and thresholds used for lazy vacuum */
+	VacuumOption		options;
+	bool				aggressive;		/* is an aggressive vacuum? */
+	bool				is_wraparound;	/* for anti-wraparound purpose? */
+	int					elevel;			/* verbose logging */
+	TransactionId	oldestXmin;
+	TransactionId	freezeLimit;
+	MultiXactId		multiXactCutoff;
+
+	/* Vacuum delay */
+	int		cost_delay;
+	int		cost_limit;
+
+	int		max_dead_tuples_per_worker;	/* Maximum tuples each worker can have */
+	ParallelHeapScanDescData heapdesc;	/* for heap scan */
+} LVShared;
+
+/*
+ * Working state for lazy vacuum execution. LVState is used by both vacuum
+ * workers and the vacuum leader. In parallel lazy vacuum, each worker's
+ * 'vacrelstats' and the 'dead_tuples' exist in shared memory, in addition
+ * to the three fields for parallel lazy vacuum: 'lvshared', 'indstats' and
+ * 'pcxt'.
+ */
+typedef struct LVState
+{
+	/* Vacuum target relation and indexes */
+	Oid			relid;
+	Relation	relation;
+	Relation	*indRels;
+	int			nindexes;
+
+	/* Used during scanning heap */
+	IndexBulkDeleteResult	**indbulkstats;
+	xl_heap_freeze_tuple	*frozen;
+	BlockNumber				next_fsm_block_to_vacuum;
+	BlockNumber				current_block; /* block number being scanned */
+	VacuumOption			options;
+	bool					is_wraparound;
+
+	/* Scan description for lazy vacuum */
+	LVScanDesc	lvscan;
+	bool		aggressive;
+
+	/* Vacuum statistics for the target table */
+	LVRelStats	*vacrelstats;
+
+	/* Dead tuple array */
+	LVTidMap	*dead_tuples;
+
+	/*
+	 * The following fields are only present when a parallel lazy vacuum
+	 * is performed.
+	 */
+	LVShared		*lvshared;	/* shared information among vacuum workers */
+	LVIndStats		*indstats;	/* shared index statistics */
+
+} LVState;
+
+extern LVWorkerState	*WorkerState;
+
+extern LVScanDesc lv_beginscan(Relation relation, LVShared *lvshared,
+							   bool aggressive, bool disable_page_skipping);
+extern void lv_endscan(LVScanDesc lvscan);
+extern int do_lazy_scan_heap(LVState *lvstate, bool *isFinished);
+extern void lazy_cleanup_all_indexes(LVState *lvstate);
+extern void lazy_vacuum_all_indexes(LVState *lvstate);
+extern void lazy_vacuum_heap(Relation onerel, LVState *lvstate);
+extern void vacuum_set_xid_limits_for_worker(TransactionId oldestxmin,
+											 TransactionId freezelimit,
+											 MultiXactId multixactcutoff);
+extern void vacuum_set_elevel_for_worker(int elevel);
+extern void lazy_space_alloc(LVState *lvstate, BlockNumber relblocks);
+
+
+#endif							/* VACUUM_INTERNAL_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aa4a0db..a0f9578 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3147,7 +3147,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3156,9 +3156,17 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do VACUUM in parallel */
+} VacuumOptionFlag;
+
+typedef struct VacuumOption
+{
+	VacuumOptionFlag	flags;	/* OR of VacuumOptionFlag */
+	int					nworkers;	/* # of parallel vacuum workers */
 } VacuumOption;
 
+
 /*
  * Info about a single target table of VACUUM/ANALYZE.
  *
@@ -3176,9 +3184,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOption	options;		/* OR of VacuumOption flags */
+	List	   		*rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index 3e733b3..20270ef 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -60,5 +60,6 @@ extern Expr *preprocess_phv_expression(PlannerInfo *root, Expr *expr);
 
 extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
 extern int	plan_create_index_workers(Oid tableOid, Oid indexOid);
+extern int	plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested);
 
 #endif							/* PLANNER_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index f1c10d1..9c1d3fc 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -828,6 +828,8 @@ typedef enum
 	WAIT_EVENT_PARALLEL_BITMAP_SCAN,
 	WAIT_EVENT_PARALLEL_CREATE_INDEX_SCAN,
 	WAIT_EVENT_PARALLEL_FINISH,
+	WAIT_EVENT_PARALLEL_VACUUM_STARTUP,
+	WAIT_EVENT_PARALLEL_VACUUM,
 	WAIT_EVENT_PROCARRAY_GROUP_UPDATE,
 	WAIT_EVENT_PROMOTE,
 	WAIT_EVENT_REPLICATION_ORIGIN_DROP,
@@ -1032,13 +1034,17 @@ typedef struct PgBackendStatus
 
 	/*
 	 * Command progress reporting.  Any command which wishes can advertise
-	 * that it is running by setting st_progress_command,
+	 * that it is running by setting st_leader_pid, st_progress_command,
 	 * st_progress_command_target, and st_progress_param[].
 	 * st_progress_command_target should be the OID of the relation which the
 	 * command targets (we assume there's just one, as this is meant for
 	 * utility commands), but the meaning of each element in the
 	 * st_progress_param array is command-specific.
+	 * st_leader_pid can be used for command progress reporting of a parallel
+	 * operation.  By setting it to the leader's PID, the progress rows of a
+	 * leader and its workers can be grouped together in progress-reporting SQL.
 	 */
+	int			st_leader_pid;
 	ProgressCommandType st_progress_command;
 	Oid			st_progress_command_target;
 	int64		st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
@@ -1205,6 +1211,7 @@ extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 									int buflen);
 extern const char *pgstat_get_backend_desc(BackendType backendType);
 
+extern void pgstat_report_leader_pid(int pid);
 extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 							  Oid relid);
 extern void pgstat_progress_update_param(int index, int64 val);
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index b2dcb73..fddcb54 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,7 @@ typedef enum BuiltinTrancheIds
 	LWTRANCHE_SHARED_TUPLESTORE,
 	LWTRANCHE_TBM,
 	LWTRANCHE_PARALLEL_APPEND,
+	LWTRANCHE_PARALLEL_VACUUM,
 	LWTRANCHE_FIRST_USER_DEFINED
 }			BuiltinTrancheIds;
 
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 84469f5..c3715e4 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -250,6 +250,7 @@ typedef struct AutoVacOpts
 	int			multixact_freeze_max_age;
 	int			multixact_freeze_table_age;
 	int			log_min_duration;
+	int			vacuum_parallel_workers;
 	float8		vacuum_scale_factor;
 	float8		analyze_scale_factor;
 } AutoVacOpts;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 735dd37..e2655fd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1836,13 +1836,21 @@ pg_stat_progress_vacuum| SELECT s.pid,
             ELSE NULL::text
         END AS phase,
     s.param2 AS heap_blks_total,
-    s.param3 AS heap_blks_scanned,
-    s.param4 AS heap_blks_vacuumed,
-    s.param5 AS index_vacuum_count,
+    w.heap_blks_scanned,
+    w.heap_blks_vacuumed,
+    w.index_vacuum_count,
     s.param6 AS max_dead_tuples,
-    s.param7 AS num_dead_tuples
-   FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+    w.num_dead_tuples
+   FROM ((pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, leader_pid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+     LEFT JOIN ( SELECT pg_stat_get_progress_info.leader_pid,
+            max(pg_stat_get_progress_info.param3) AS heap_blks_scanned,
+            max(pg_stat_get_progress_info.param4) AS heap_blks_vacuumed,
+            max(pg_stat_get_progress_info.param5) AS index_vacuum_count,
+            max(pg_stat_get_progress_info.param7) AS num_dead_tuples
+           FROM pg_stat_get_progress_info('VACUUM'::text) pg_stat_get_progress_info(pid, datid, relid, leader_pid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+          GROUP BY pg_stat_get_progress_info.leader_pid) w ON ((s.pid = w.leader_pid)))
+  WHERE (s.pid = s.leader_pid);
 pg_stat_replication| SELECT s.pid,
     s.usesysid,
     u.rolname AS usename,
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..b8805b2 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -129,6 +129,8 @@ ERROR:  relation "does_not_exist" does not exist
 VACUUM (SKIP_LOCKED) vactst;
 VACUUM (SKIP_LOCKED, FULL) vactst;
 ANALYZE (SKIP_LOCKED) vactst;
+-- parallel option
+VACUUM (PARALLEL 1) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..8cb8f64 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -103,6 +103,9 @@ VACUUM (SKIP_LOCKED) vactst;
 VACUUM (SKIP_LOCKED, FULL) vactst;
 ANALYZE (SKIP_LOCKED) vactst;
 
+-- parallel option
+VACUUM (PARALLEL 1) vactst;
+
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
-- 
2.10.5

#4Yura Sokolov
funny.falcon@gmail.com
In reply to: Masahiko Sawada (#3)

Excuse me for being noisy.

Increasing vacuum's ring buffer improves vacuum by up to 6 times.
/messages/by-id/20170720190405.GM1769@tamriel.snowman.net
This is one-line change.

How much improvement does parallel vacuum give?

31.10.2018 3:23, Masahiko Sawada wrote:


On Tue, Oct 30, 2018 at 5:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached rebased version patch to the current HEAD.


Because the patch leads to performance degradation in the case of
bulk-loading into a partitioned table, I think that the original
proposal, which makes relation extension locks conflict even under
group locking, is the more realistic approach. So I worked on this
with the simple patch instead of [1]. Attached are three patches:

* The 0001 patch publishes some static functions such as
heap_parallelscan_startblock_init so that the parallel vacuum code can
use them.
* The 0002 patch makes relation extension locks conflict even under
group locking.
* The 0003 patch adds the parallel option to lazy vacuum.

Please review them.

Oops, forgot to attach patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Yura Sokolov (#4)
1 attachment(s)

Hi,

On Thu, Nov 1, 2018 at 2:28 PM Yura Sokolov <funny.falcon@gmail.com> wrote:

Excuse me for being noisy.

Increasing vacuum's ring buffer improves vacuum by up to 6 times.
/messages/by-id/20170720190405.GM1769@tamriel.snowman.net
This is one-line change.

How much improvement does parallel vacuum give?

It depends on the hardware resources you can use.

In the current design, scanning the heap and vacuuming the heap are
processed by parallel workers at the block level (using a parallel
sequential scan), and vacuuming indexes is also processed by parallel
workers at the index level. So even if a table is not large, the more
indexes it has, the better the performance improvement you can get.
The attached result of a performance test I did before shows that
parallel vacuum is up to almost 10 times faster than single-process
vacuum in one case. That test used a not-so-large table (4GB) with
many indexes, but it would be interesting to test with a larger table.
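
For reference, here is a minimal way to try this, going by the grammar
and planner changes in the patch; the table name and setting values
below are only illustrative, and the view columns assume the patched
pg_stat_progress_vacuum:

-- Allow up to 4 parallel maintenance workers, then request 2 of them;
-- plan_lazy_vacuum_workers caps the request at
-- max_parallel_maintenance_workers.
SET max_parallel_maintenance_workers = 4;
VACUUM (PARALLEL 2, VERBOSE) test_table;

-- From another session: with the patched view, the workers' progress
-- is aggregated into their leader's row via leader_pid.
SELECT pid, phase, heap_blks_scanned, heap_blks_vacuumed,
       index_vacuum_count
FROM pg_stat_progress_vacuum;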

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

parallel_vacuum.png (image/png)
m��@�q����!<&s��W�#k�X��w��>�v!'d�P�<f���TS)��H��i����&��*	�P!5��\�]Q���h�������1^��FS�
�Mp����Rk���!;��A!1�@����J�;��� ��>����f�72��1$�-�X�3�8�V�'��p�%���*e3}u\���O_>�����-��?Z���������\� ���Z2���P.�[>h��4)v�Rd�Y�lM6�&SH3���K���N:�������f�UVit���,���s�a��Z�9��/JF�����`�����V��<����d���K% -c>�W��]B�n��8���6P��b{�8�9���
6��&�����I�

'�k�@����M�Y2A(R��F����"��>Ad����"��9������(���jS�(|�w�?A��������#�p�h�ZZjA����w��w�y!
�d�$�9�����re���/
� ,�Y�p�����%�����-i7Sc�T?�������ZV� ������G��X��=K�P���t�'��h���.Y���$q���-��do�YK��J���:��T��/����]�;	! Fo��R�I^��\��6�h�b�AmV��Ob�;q�~'��	���V�����@
8��"��M�����`�l��D���3�x�o��D��m�^����s,�d�|�B�:����d��bW�mh����6^#v[�>K������� ��m2s�m��d�$�/��'��c#��c�v�xm�����%��K��{���+�!P������m��g��>�J�
B��x i!�m��}d:I�w��q���.
�g�����k-I���q�mC�4����_����q��M&�����li�\a�~�T:7�7���$h�>j�M�
�Z�F�!����c��D���%�7�YB�_|a?�����kW;=��$�s�3��Ju�$���PI�nf�����,�R�m �8�IH���U������gI�Y�:������u4@���c��isI7H �k�e�������0������3}'��&f�ZK`6��~/�������/�,�T����nl�y�&��k����]=P�����OLG�y��'�R�;EV��E]�>b��xK�C!=K��oh��l��]'rCl������bs���	L�?�h\jA�
A�)��f���"4��5v(�@�
�����d�h*�hz�h�M���>������A�����/�����-����9������JV����X
�/A��4G��������r#F�D�1m��� \/�5��[���D�����x�%�����(~_4�
4�����([4�\M#y��������,a9��N��;6�!$�N��)���n�g8��
�){h�C�1�2A����(���n���.v�D7�.�MR�Za� �Q�J��������/�X������n���
��k�F�t��7��qDZ �	�!���@
����}D0"0�;Y)h�����Z��AV��G�Nf9M�>������ _���f�P��D��>H:���Y����� �ok���������W8��� S�d3vl)[��q]c��h]{�Fh���������*NQqQ�&�O�4��c�����i*�Z	�$���g}�x�5B��D����d��i�$N����p�p~7�x���������6Y�� ��J�P�h:B� ����'a2E�M���J��ah����Z,�����9;�����d�9�����y��ly-�����yaA8n�����e�N���R�_������:^��fj������i��4��5�dJ��97�r��#��������.���-
�J��"���f��n�����M&��tW�y���@���q�L�Br�\Bf��g+t)�� R����j�����mf1
?�9����� ���������+	���s�5�}D���8B�T";�J<����
�f�M�' 5O���0a���@��>��f����#?{���n�2��O���/k���yc��$��]�|�zU� ��R��44�;��Y�����T|m�E9>���j�M�p!}C��"]�����_|���&H<!�C��C
��A�8����'�M&�q,���b���[�X� lU��5!u�|��5��5��A���}S5=[�N���G�������S������k�<�F��G�}��S����s���'H�^��[���@05m0p�p<���us���[�J@6�Yd�El��^y�����$,?:�Tz��A��%I�r�N�"��].p�����U�����k ��d�T�.���'�B���.Q���;�4uB��� �/�C*<w{�i!�i�c���]4��%��M���tWQ���t��#M���Y�6=�������bh��ZZ
�P{����Ch5��*-�I#�@���L��>��������H$6Q��T�O�-lI�&���h�\����5
�5,�f��y��q�8�4/�����M-�8R1I��0��\�gf ��4x���W��[A����Z�1QL�{&|�l�RC�8�������v�L1��"0�C����������rC ����B�=���b��P��,��n)Y /H �������v��
��[e���La����v�����[.xy�!�rl1�����9F~��(���O �e@�	���d�%xW�q.%#<pl��+�8o��H�qSc��k�|�w�Yr��u|'�
rn�m��z�]�������������s�c"9V;�!�1��u�=�4;_R�<�|q.��������
�i��S���=��l��fM�Rw������N;-~�>��$Y6����	���M!��[.B�*p�0�����R@"����q��Ok���F�G����gX�y�M�m!�M�Vx{Q�9����~
*Q�L��!�]�L��Fmd������;�����f��
��p
�'��:5���<I"�g�e�F��@�:��a�R����DT*������6�K�M4�������,�?m��N}p���h�M��%�n����t�t�����=��N��y'"@��s��r��
�P���WR�����Bd��`
M�������]"��%��ZTK���E�������@��!�]�����ae� t������L�Y����SOm��j*����EI����L3�������S�
��bq�!7m+KI������e������ �����I�Ac�{���h�8K|��F���{����C� ��	"�����a� d��N�V�>�l�5"��,����V������'���,Z�|������U��W-A*!A��h�D�!�Cy�e4S�JA�i���a��E���A#4-]k15���V&6<�4�9�d����
PL"D����w����[X{c��8��������&Dn`�bJ�C`��!��VS�Za&A���C �XA�A�����k%'�/��� �BQ	�F������-Q>�G��%�hz�H����9uT� T�@���T��5hk���;�Zv|#
@��aA�P�\(�=�J=�g?7� 3
��c�k��_4%�4�u�B�����o�L�5�B�y_4Sc�#Ksn���� T�@���L�}kEj�(hzNk���~�A#aA��4�4�b�0� l�.)�>���P������4�k���Vdy��,h��k"4� �(a��R@j �����l��n���BG��Ju�8BI����8B��"��/A�0��("�P�k��i��q�h���BR��<GQ��a�j�����\���_�n��u
��JG���*L�G����\cM`�FS�h�E������&���>
_sn�FX�BR��A"�g&�)�Ouj��Y�<#<���zu>�p����c=W+���g4��h�g4�*���I�����4M�F����P����{Zo���%m
-��.���p���
v� ��"nR��r�|�j��!����u��}�����3�{��f��Wo�32mN;��$�n���������,�����#���_3�Fh��w+�3J�"�Yb?�m���D:H����Cze�{�L���%@T!�h��Hk����Y��E
7P�K�����S��)�$=K�?	M�E�*�|��[�}r{���s,D����+�#��E[���4X������Tw{��6^�Y���%Sk�<�S�^�����qu���,�}c_���q�l�1���r�8vR���1�������\H�4����;�Fs~�s��"������%�������Xs������o[+�u�/1��`��f&��R]b"�m\3b�p���+������A�K/)�&��3�[�1��3��u�<�v��d?��-�K�,���W�+�{��v���."�Sr�QG��4n�$oa>!��}��'E8KD��q�HC/4����F��f��$�=T����hr@W1v����9l���f�����B#5'G�X>�����E�,�4��
S���/��&_�H�A4}k���
�Oc������	�6������[T_c������h�}H#l�:�����o}�T��?jkI�\r�����p�f����#��=�5��y�+��+���z������}���4N�
'wA��5	����Z�//3�~�s�,;� OM���}7n\��M;O_��&�U�>��7�����}S�b(dj�4�g��4d�������=A�3n����*%O=�T���� �>h�"���u���NH��z�WF
~$��������i�Q4�����b����x5Y?��1H��TBsn�FX�Bz�2@%��v~��� ���/UM��F�h��g����[�f���hBT4�Ssn��������=������i�J|�/�BN�X�i��B#�����C�,��h<�O������"1F/���V3\t�E[G�%A���k���$"�����^�%�Ni���g�d{��97aj\��f����c��RO������?��E�E����@���hD��z���f:���j��"����j�RT`i|���7E��K$�������l#~Q62�D}���h�~$����Ksn4���?���w��t�������.����p���������0�AsQk��@W�)�F��`!���������5M�vM��&�Q�a�A����;������	�=��\�ER����]��f��N�8X4,�A��z��y�������vYl�����==�]Z������4����h*bHI!���EiD���g��w����&�GJS�P��$��
�L����A�4�r`Z}M��/�8���n�����7�1v��x-��p�H�F�x��x-;g���Y���RX��F ����;�l��f�c�z��m7�p#�a����vG��w��V�F��s1������/��)��#�6}N`�=p��s�
	��<��H+<�������&�V�TGc������4o?4�0�V�aOsn���
�������BI�	g���s�=w� \k��=�>�a��#�}��������V[�>�3������;�oO�1��
��1v��Ze4i\��B�BE���_�2T#�5��#�&KB3-����3O�6U�H�{��._
���-�O�:�bX�.t��g��x��?�|��D.�J��{�z�iv�����I�~����v=����.�z�}>h4|_��qHh,�tA��.�"�,�h�G�1Jej�"=KV]u���� L3:��{��p�������q������>��R����L�d:�����GS���GJ�}�YR,@��`����RH�E�[J�4&��B�y�S����@��*���O?�������:����w�3�<�lB�d!.��������f,��
H��;�B�����������h�`�����J�'(�����%����%�%�@�u|�d�����IkN��>�=K��}M�c����Y��N�7���,9�r� �|��s�g����fl�� ���W^i���|1�D[�}"�Y����=_r<����KBad�!��f����/q��w`���%�K;_��F~���(e�����n����|o������olf�uV�<+U	�j<`nZ�Fm�8��~$�l��}<�������vr��]vY�����n�������I�������@�m�hQ�S}M.M��&@^��8����k�P�WZi%��@�#�|@��^>g'�����s�9V�#��#z���J�
�#�|���1Ic���%�kM9tM�Y�TJs��N�4�;��)�hj���t�������R�p��f��Y��v#����{��,$�'L��c$��BM��qYS�Rs�ib}j5�rMH�F�Sd��v��� d��&�-4}�-RS�A�����af�i&��[�i{�G��jhh89^���������{3��M��&
@s�*�YB������{�����@��!���^.5�n��n\�&}iJ�1������|��v�����V�������,��>h�M(�U�B�^�c�=����j�����}^���_\��%~��'�.��"2K4}�5�%;��YR� �6����hl%y�(#������[�R�I���4��w~	�A��Cc�h�
'�N	��T3e���ir�5�&8K��AH�'������f��M��h��nhh��qbhz<h��4�Cc���T4���)���qh�����1u���x��O�G_�t"O<�D�'�k����q!>s�xM��I_G�����M����&
B���%��B!}J��rA���CZ��9K\���Jh�%�f��_~y���zu�h�@gI@C!��W�^6�������{����i&#��u�qvL��Utg�7���%u�,��M��,�J^�k��?x�{�r�F�x�1c���5�z���}D�u�&���4GSIG��i	@~�/�s�/�!wAH���o�,tP����\���U�}Y/�I�����Rej��\��R35,J�f/h��;�F��4�����8���FH�-�-wPY4�N^$	�!��U��1G�k�@1�qz����sgs�z!I~uUsGG9>�|��!���`a���Y�%r�T�����,����_��$A���5^����K5���	��:t�h�\�AQJ
7l�����/��N=�T��d�m��'�l��Y}��+6�I����/5���n1th�~�����Ih�h�#�C�E���4Y"���oU�L
5up�C��P3�h��^�qz=����G�D�I�T�5�P@��(����-���}���A8r����lh����E�#��Ne�������ixi�H��LW��%�_������Zw�)�X�	��T�*��������$M#������=z���Ni�=�����w�@~�z�uZ���z��v��;~������T��X.��$��� <����-��b.�����(����l��Svn{��"�����n%%��u�Y�J�l7n�����������~�v�h����\

0��hkR�Br����2 �B1�� �N�e���4K�#����6�0�*����D���J4�L�ev&����s,D����3����8�b��m��(Z�h���X
aH/
�|�����T��m����d��W9_+�P���c��MV���/z���/H;_�{��%�X���!�lC#*?_��J�K���/9I�%&"w��&d�Jf�������/�<e&/�/�m~��o\���/�-�^z�������3�h�g�.�<���5�o��v��~I��A(�A��4��(�oI����&M�%��
����}r����Ek�/�E�����x����v�4��>���]�4���$�m��ZC(�~��g�>�h��6��O?�����nHz�o�H����&�,����������PM�����jl�i�r%4A��v(M���c�))��z�L
�.�ZS�wTC��]N�����U�����m_�� ����<�3jTI�MBZ� �$}�8K4��Ju�������F�j1I[V47)����"5�;M����4�g��Y�$����9r�x-����������k�����<�g�C����/8,^kB��>Hs�4F��G����^w�u�Zv0����2K4��5����r�UW�k�	�%�a+�������a�>|�hZ�7k;Tqh���&�JI
ih�b%���[o�5^�N��B�����Z	�q=�Me����0,5�L|��Ac���|��4�������:7E�p�
�0� �������N��c���/�k�3y��hJ��y����-��G4BM�9�;�&��������hnRE�Lj�&8\��	��5A��~�{`��8&O-%�0�!���������x4�Af�y�3������YR)�1�nx���L�,��fj���m����4E>4���n�)^��gI1A�AB-�r�2|��Z �:E8K4��A��k��8Kn���x-;Af����?^�iP���&��X��>�����5��o6�����)i�d4k�:7� �(];�f*���Qw�9�#�2���������q�h���G�)����j�Gy��x-;�NQ��&�[s���� 3
�����L���,&N�7::M�7H3���LJ%'�z��x-;Af���I;�G)�������x-;��bL��l;��9�Wg����q�h�!h*�h�Mp�C�Y�����.�������Mh������KS0Uc��4Jk���k��f���j,����Yg��ydS+��7���4�L��F��=��(��&�+�f��9����� _DU���j-�������]
B*T�����KA�k���}<���lM�e���HA���Bx���UU44��,6k�f�}�=�eG#�5�����V{�������q�}�i��(g�����MJ�t�KA��K��O>��;���d<iU�����,�@���iz�������k���a�E���m5i�=�@�A�y�z�5M�|���������v)\pA3bD��K;0}J����64�.�l�]�R�i��g��v�P�Q����DK�)��������G_�N�/�y]�6���{��l���]U�1E���#yw�������J���������u����&S��m�[9^rLe�S��c!�A�xw���� w����/�G�m���s�E���ui�KR�d[��q�q<��znde�d�z�m�>Yw�C�m�N�&�?���v�*m�����m��W�m|g�s����mI��Y
3�����>�J]B:����P��P����^K��V[5zi*�F��%I��D�'��i�]"
q�h0����?r���>�����Y"hbv5����g	���������f���Sv�O_h���s:���V�:��s��$|�������� w>~�y�x���a�G�q��n�_���q� �GVD��A4|h���=�b�3����]g^�j&���~�Bc����������	�)"!�G�4�{i��&����#�04A����}�������h{7)��P�MJs
��S>�B3�����/�g��O��}��SG�������M�H��G����|�eG35�m��� ������w1/m<U�%�.}���/��Q����e���"���"��-�������4�IM�����F�
���Fh��??�_��=��y����],y������7�S�X��_41��t� ����P�5[�F����"�'�����=�cV���z87���B��������=�P���q]��+���$K��5����������
'G��`hQ��#��4�-B����__�;^� �b)�?X��q��z�_x��x-;��@�f\�T� �0-����7��*~������5��4�3��d�������P����qi�j�u��	��b��F��;��=^�|��v&�����W#����T�8K4A� ��CV��:+�#��Q��vA��]���l��4�����������5�8k24ScM��j���3����:�o�h�����aj����K$�D��dA��[��J�	�����#EL�5E4M�5S����w\A�����[��yq�.�����[���:4!E����L�\*B�P?�e����������� cAx�� ��m��K�����V�b^X���i.�z�;���)	c���q����c"a;�
������z>7E���"��	���k"4��$v�	�����^��ye��7��U���m
�V��J�oR-���nR��8�Zr�����=X$���	g�	'�`�N�~J9���M��� <��������w59>&�c�x�}�j��d�*������Op�Bq�L�<"��F���(�������Evv�� �t�M�����_���Ki~�~����O<���s�g�G��H=mrt������By~����8v��p��
�&:��]D��g�y&^�����o/l����� ����m�o�H����6��4w"x����9ZB�����Dy��l��^��N�dFd^�dj�`���j�F#�t���0������-���5�G����"��m��.5�Y5v(MU��C������}\x���#���{��9b��;�8�^N�~����Oo��wG��h��7�����_|�E��y�W�G��v�5��G.*	-���k�����j��
|����>j����z�8j�hs������bn��fR�[h6#Qb��+�h��������-� �J�&7
�r#4���|6�Piz#S9.z9��[���OO>���W�}�\������+�#u��,����r�q��B�qS���l�5t3��/��>�U�~�a�^7�
G:�������������~�i���/�d��W�|a����@�c����{�8����c/�\o��;�|�9����/��4$������D��]<��O�����_��lc���A�4c\�r�i��5��kFt��Kz������4�����Ig<���S�����h��y�������������\�c@'9_R������g���R7������9���^XF	�Xq������7��f(���r��h�y���?�W7���k�xK�6���yX$JM�F����'K��Vk"t���EU�_n�>�{��l���x�7�pC;E{�SO=e���<N"I�� �8K4A�7��a�y|��yc��J[�P�.�������C�����j*�h:���N�_�q�K,g�4�MKLAS�L�.����� ��$A�Djt��c�����4��@��iI�DS�'"�8k��o�4��4��V%d:���/&� L�}+h������sm�q�yy���g���b�t�����gR�~>d�wfkP�{��/�)���tz�H8��%�|��C�)���C���n�>��"�1�
6���������c��~�x-;E����F���e�9�w:q����Y��D�q�h��nOB�����q��o�wM�Uq���)�!�� � ��W��,`G|q�.��a�x��j���E�'�p	F4����C7�1	���Q:47� +B86Aj�l�2�L�H�{q�)���_`�����|v�BQ���XM�OM>�o/�-�5q�����N���\f����e��/�	S�b��A;FS�zs�$1���s�u��*����]^�};�����j'�t&N|�.��/-��Z����NA����(g���O���e�*MEM���<^���O��+1����T������g���������������%���8�-��X�,-���@uA�A�I�Yc����i
h��#>dF|���������f�5_G���/Y>h,�z��s&�f�#��de���1'&Y)���I��AA�Q��<���N'�`��/G�+��nk�A{/�L8j������z��7#���hY"��$�5vE���fFP��E�@A�Q�i�&���|��w�k��I����x-;Y��w�+U��w)��=7����R�:K4��tZc��\_$����B��5>-�GBj�hY:�a�k��H7� =����T
��l �x������7X���Km����1�t�F8���3���`1b93iR�T��2��F:h��x-;R��M�c���M����!,0|��|�(�����t[�+���e��-�hh-c�~�1���'��'�h�,���I{�)c���H�S�U��9>hB{4�VML`K�4��p|"z|�
�#V��N�O�4{tj���m�9h�f����ol����?����(���~���x-;Y��\4[V�9r��f�k����=o^�������,hL�����4k��fZ���~�;�o�)��E�q3|����t�zkp���G_s��F�K��YG��|�7�^�5�p?�w�����w~{�3����;�V7��Y2�Ks~�����B�R��(4������ot�{���O���	��s4n�����3��A�%KC���������C�N[*�����d������K.1�l�u;���8p��E��;�f�mfz��e_����s��=���#�Xjs�*]~��$�e����
�Sb�a�KISCS�I������Q��J�/��i~}�^���C�;��0�l=�y����G}6�_����%R����)"�$�M`�����9rC+<XF�\'Z�����N$�L�[�2v�~�w�c�7�b���������GI6[iG3P���L�\�Tp@���T���v���J�j�%7I��Nc��y3�	c��������P~��=��]!��u���x��?��O^3��������K�4��\����A+��n�������_4�%>����w�B�������A�6�/��o����l�����4��=���l��n��&�P��g�m�Q������
���E�4;��5��1?����$��!qc$�1r�9��������G!)m|�3�L�4�8��]2_�=hE��K���������O����u����^��^��������(#��i�_����sSgI�!��D����[!�j��:z H�&�F��{�Yu�U�*	B�:m���f���3�=�����1�����lp)
^z�PKO���#���!�S�FS��"a�������U#-�N�y�����&Nx]i7��	#�?����g����B.������L�\���SHS��x���z�g�]\+)~�&9��[y�C����o4j�r<�c!�FKQEZ�
r|9��E �^~�
�}���6b%Q�a*���[cm������/KS���z�L��y������:������5�v4�?w�y�������6�����M���Y�w>-M���R�aA�O,�8�^8&���]�������\Hq������)��c>CnZ�����>C&r�k��������u(b��s�����?�����l����=_b��@���y��G}d�ecD�����d��64`���9���@���*�9V����5��:���yV�F#����!�u�Yv���X���.k�8�p��YD4��E������
7��v�A���
�#�q,h��%�w��n�����'n0�{�Z���o��1?=~������������m�g��3�Y���Bs�}3K4^s��]�2�	�N;m�����o������n{����1�@��"w�$��q�)��?/C�\Yx���rs4������"n���R�D:��P��$4q�E9Kn������{/4�}��f��^d>:a�������������%mE�/d��e��������M�n�
�u	��H]�8�~v�H���Z����O\j~������K5��,��� �W����[+� ��bW.5����"�80Z���O�����'@��{��}�8K4N|��\�R_��$����S�M��{.0��u�����v���=�'�wm�|i������P�������0z�&�A����m(#����"���e�Qvx$07���<���~i���%�5Y��#�FHk�Pe�~w�Y�r�]�{������`�OO��l���m�w��wL���N�Dc���G:�F����|L��������~}bt�������f�!\	[#��2����7��3"�HE�}#��W����@�������O5��zF�������N��>s���N��D���g�4��jF��y���I3�d�hnR�Z�h�C+�0k�a�}{�Z��N-9K>��>TdD������M������k��l�����D��|�_s<=�P���?|�8K����hYE8K����{�77�o�|{��V�|}���w�oS#?x����2-C�?(z�����>�_V�����b��z*���u�S�#'E���HS_g�f�H�r�6�h�hZiy��j��EZb����{Fw��47z��H����;+���DUR�Z��Mi�%�T�>8�R�/ms�!7gI|q||�����6�m?�}���>�/���Go�n�o���_���N2�v�2�oI�7��_�lZ3�L�xu�P��"pxl#|�Gcz>a�~������in���lo������O$8���ei��H����h9��/�GS�CK�-��x��W
�ZG���.IKSw�&����=����Y��[�*��E��I ��U�e�����Z)��#0����F8���2��HN�w�u��A���1����@�����>�>��L9�_t��Q���Y�a�����TM�"A����@<*���������������g��r`+B�����l�H������{4v(M�r���B�#���s�<.��.��K���97?G����������K���������� ����M	�"���p�1��[�2�����+G�4O�7���@����Hx����F�l!\��c��O�IK[����{����0!�o��}���,��a�������4����/��f�s�g���]IK[�n�>�9::�[��V����n�1&��
�h���!���jf�!����oiBd���6Et�y�c�4���C�z��R~��@-w�0����xZ-.�#��m�o�z���F=e����G|�]����VL?�)���(�u��W��+��{n��j���;� dj���W�2{-c�����=�msp����������]��2R�bz>��/��q�'?0/�o���\�}6����������d��%��:f�3�{~�1��G�3_7���M�Y���� �����(�o��qX����oc��`��o�y���������%_���Yr���3K�Dr���c�4�	�R��Lt(�����=JEn�9�;����j��s���L�k��.�����L��N�O����~��W���B�����;����n��J�E�g]�a�4�R/���V4��U�e�h���1��������~�H_q�9<���gO~]XZ.��/�r{���t[*�o�����������h���%-�G��������^{�r�KZz-f�1+E�����.We���pS���%%&�]�f6}�:�����Oo�-��^6f�s"�������v5G~�o��������0�nl�~��0G~�O�����m�N����e��g�{�fn��P����r������{$�-m������wJ�[��������������~w%�.i���-��K�[�r���������������Y+�oIK��{�����)��?cRJV&���17�zY�f�%?
���� ����1��������?���3���i�xk�e���2wFS���������z;3�fT>)v�����6��?d�6g~��P^�)v?�cz?�L������5��:�FR�r�
6���[\���}���F2x3���3�Zvn��x-;����6�����,IC�
L�@-$����J�A��{4&��������mfz,��1����;����Vc.��������1L�/�d�_��W��A\=�����������H���'��k��J�>�{/M���4��BMP�o���]�e��{����S��hGyd����k���+7V������6� �c��w����,[mE���0u�`�����J�UW�)��3�r��"^�����(����`��?��A���@�K��0��s�k��2�T�Zv��3�����q{���<xr�i��k�����������"�Zv:� D����� �	�������k�����H�]������k�D��h���rM�9�v�b��y�&Z�+���ru��P������\�-���h�������->���i��}	�*-|O��J���l[k��1H�[-������0f|��=e�Z[����7��B��A�c4��&?��M�����s%:�����Wu�;�+����\e�I6�tS��wo�m&P'dp^��N|n�� ���4�@�����]�@ �#�@ ���P��2m	���/���_��z�Kg}�8:����_�!k 2������?��<��3����A���%�x97>����x��'�d� ���D��<��S�	���W�g��{����g�������4���1Hk�Dt�����_����
aN��r\�s��N�����u�c����8���,R��S�	��>h��s�nw���B�/��R@p�wp�8V�mS�����~����,�9J.��r��o��&�N�n��"�,��&g�%���_r�%�����#���O<�G}��r�)�g��������I�b��~��vs,��~�\�[l�E��23Z��v�i�����+�������V,��k�Zj)��:��[����|�����c�1���j��	�X�������x>�����I\z�����Gc�X���3�h��k�i�������v�x��6=m���|��)�1-���������g�y���.m,���,�s�=7�b��+�h�~���x����k=��;�O&�UW]e�O����^zi��f��[��y��m�}��g��s�xk\��/��}�@�c`��������k���o�m������5�����!���a!;�y7������]o
.�]v����T�e@T��'q�Gdn7�����]H���:�����
�$a�]�v���58^&�{N:�$�^	49��*�w �l\����o���:Z����o�]�~��1"s�1�}��q��7��-m.�������2�l��v�TSM��o��h�ip��{�n%@�1�v�`��k/���i��8 ���Zk��5�s�`�s����_�b��Qm���v�����#K,�k[��^��}��:��&�i=7���,tA��!t���G�� ��>��,��2�����r�����p�
�i����
@Y5b4�[n���e	O�a��w��Z�o��f��2szd��E���?o��n;��BH!�j��V��\�heI0EA�]dp�%��h@[K��4��f�`���rA��F�*�n�]w]{�L+("J�O?����6��������U.��N{=p#soF7�|�Y`���MS�B�i���#�x�����Ix��v�h�=\~����X�#Q�	�4������,�
B��\�Y`pJz�	'�`[�;Y2����|5���s�Yfi�xf�a����r�
a����nq
S���?���64���n � D��)/�$��k!��J����*�#W2G�u��X*G��O!R�+9N26�E�������$]�D� t5B�����j��)�_}��=W\��8������>j�����/B1�#Al
}���,8�������bE��Y�H�.
.3�hQ|S.�J��2^[�n��M����.�4��k�>�������s�?�|�x��gZMLs���;{Q�9�����[o����p��9����\,7.�$�^x��JX��!� m,�:��/4[�/n����oSO=�=�|.�e�����v�������e��������=i��>��8��S�w�������;��������	���>���s����	����>g8q��zk���oF�C`��dz�q����y����{9>\7�SXO;�.Jr�8�>%��p�0��R���������������Rw24���&	^����������
��)u�HB[Vp�0E��$�4�i��J�-��h~L��a�=	�?�����������F�����;;��%������Q.jL"t�`��{~���xk���V�������������h�86��"��E]d��}�����_���C5/�����o�������.%4Q���O����s��g��2��C	�@ �a ��A:=A�NO��@��a ��A:=Ar�@\b�X������r�"X6����3��������A��|~��V������e��2.v���5�X#~V�K����Z�&�9	�0�;���;.~flJB�l�����J[s�"d�M�`�o$�#��6�lcK�Q��<�rAHI17�b�UV������j��f��b���"t.$/��%�(�1�4���h�AU�W��@S
t� ���g��6��-���`��H	'�%1���7�fIG#�^!��:�,[�H'$5�5Ax�QG��2B
�3&���?�,���NS�H�G8���CH�q��� �� D0!|�~#�`��H��>��KAP�l�����B� �-��<@01unMJe"^��X(����=�oB�E�{(q&���_�m�� �S>5��R�
-
�PJmQ9�Ta��j�L���;�P�I�m�i6�E?P��T��`���t��T��ZH�#0-�@�ca wV�UC"�?��JdRE
=���O��%T�A�I�4�����:A������.P��m�)�0�rH��T���TT(ADP����\q�9�E�1� vL*��u	�� �<�A��� 	t>� �� �@�'�@ ���B8@X���t�%(��@ ����@ �R�@ Ja �@ "(��@ �����?�1�l���J������ns[Q����f�-���g�e[<�z�.����}/���uC�6�BI>J��f�Z�����@ ��N�R,~�5��
&���{��8��;P<�����{�9�����R������;��c{����s8�����olJ��1��^x����A�4Z�@ ��J)�V��c�y�5����m`AG��O>��5*~������~��W�~��6�@��vh�2�{�����(���
�_a��IC�~�����@ mI�S
�w�y�,���f�Wl�b�CA�����)mXQ
���� �����.j����~�,�����.0o���Yz���e�}����o��N8�6
�JP �@ �M�S
Ey�8�]��c�������~�-�j�#�<b���>����������>�'�|�,��r�}�q%���n�W�������^o��Ga��?����ca��=�\��'L����j���������w����,-��MzM��0s
��I��B,*���b�f�kj�0���g����[+w�kj����j��k�|�k�]���O{
{�����kj���x�1�sL3n��6������=������>��f��q��������om;����'��c�kO��k�p��?4hP�r����2����������T��[m��5j�);9f�{�6{����V�^S��X�������G�$����qF�������p�������J)$����� ��e�@K�����ok-<���������<��s�����^�~���U��3�������t�I��
Xq'p�V1|��������!M�=z����?�|��W^y�y��[j���g���w��$��a\ps����i�
�����B,d�����9��S�g��|�}���g��u�UV������~j��k��Y~p��r�)�g��w���
�	&my����'�����N;�d.����Y~Ju�A�����;�0���n�,���
�
��y�����C�Z%���0�t���)�,��_~���������1�5(����5X�|]�

}���b>�K��v��

���=~��y����g������[�g��L%7o���K�g�A�k�	��'���
�_|?����z��������������k{���������K/�4~�|�:o��n��@)�[p�*����h4y�����0���&(������[(�c��g�@x�-p���(�(�EP��h&87>1�Z���}�r/�<�;�P�9^y[�8�E\�P�u�r��z&����7-I�E���HP
;9�'���G��g�����4e4�@�s@�"��w�y�9����.h]tQk�N�	�v�mm-^��(���,
j�n�����B�6J����6�r��3,��]w�r0Ay����?���F�5��n�����[��E�|>����={�����������q��],�X(�&��-�f0�!(��G�]h����
9��'��������+~��o�1~����L������k7��]-����/��2~��	���7Tx��g�g�Ar�}��?�������Z������"]/���U���g;��v��n��&+�h������P ��3���#�I�������N����o�����.���V��%N��?���[m�
J(^
b�I�|��G�W���H�A��4��(��z`e$��-��&p��cRo�0`9��~�RxZiC :(�$�|����$t�"����i$;��o�����_RJ�RH��)����ZL��w�q�����m�?W�f;	dGy�]�z�����oFq�RHB����w�yv{��J�����2��U��=�1��Z���w'��w�����z�c��0`)J)$��Z�yB���!.�l��A�b��\5E��a)."���|�dS`%���7$�H7�<��>��<!�_�o�	qlE\7|GG��`��+�x��a�#{������i��IJ!e���B��;�`���ni����E)���Es��6jv���'B�Vy����C?�a]Wu���s&:���!(�K�G�>}�����c�+m�.�����p����"�(����$��k
�$���"��VrI2(�	7�"8^y_;�{�[S^0���>��n��	��W�}z������O�����5��N�s]�v��^LhP,)�$�U��k#.b����62my���z��b��5�;x-�B��QJ��\u�U���p%bQE�r�9���o��@Q��9��hJa���Gs���Gk�D7�����@ :
A)X�R
	,��u�+���y�%�l��a6L_���:PD�	��E$�<���'�P(��D,)��������Y>P�������Y�u�wt�q ��������(��%�@�u�R��uT��R8fLw�^k�D�0��$��!����yClO��a�hB�x1e�W��V(�w���Y)f�V�E�<��^D"��"��C��:XHW_}ukY�S�wh�-�0W\qEU���|����?�i��;Jg=�����e�KJ!��k�T�I�}�7�t�h�E�7b_���"�������r[D�.�L�P�9^����=c o�E\7|GgRR���*��{�d�;h�����<.�}F	���{�Jx��v�sT
A)X�R
�p��$����/���O��!����/�8^�����Yg�e�W\qE���K�LdJ��`R�������*;���-N'�YO7�t�v!`��+	Y�(���]cM���g�a��r[@�y��)��q�/
����v}r�>o��l��6v��	Ja��*��&�/R
��q����k	���{,~�$� l�7����Y1B1op�hB�A������Z��DJl�
���n��=��?�,�Et"�V���1��� Y��w�y�*f��{�]�F!�h��-�������LWN�l���9��e�]���o��$����k���(5Qy���$+����k�w���eo�����?���8�v�T���A)��3c��BV$u���PP���RHS^O����?�d�,�1��k��b���p!�q�N�R

GE����@ ��*��-�\�n&|���r������Z�>��f��������z/s�~���G*(0Q�m�@Av���!(�4P��������k��zy�����<k���\3��4�}����9�{��3��`�{�.�������L�:7����^,��0�1o��[���(*���N;��������Huw�q����y/0��"����+2+E)�(������"��r^pG�
�����Lb����kB��qo������|�����7$h��f��7�<�FOA��a�\�u�wt�Db(�������	'�`��2'X�Q
e�I'�d?%�/�S�8��P��SJ��{��K��Z��+��;�
�}|7����R��?�#�0�|��y{dS��I
����2{�w��m	Ja
�*7�9����c+tX���p!j���q�.��pa]~��v�:��c�K7�7�
+�`�(|)W
'M�6Rq!��Q�%�f�y�M�m�������>~�(�E���Ed9s����"@P��H�",oEZn�y�u�����h��-y��Wm/b��b��|�w$o��L8	�`���g�}����E��O?�4�R%��P ���1aar$���	��	aa�!����{q����T��{�W"��[k7����3M)D�[c�5lp-� �zM�����t0{3C���B����/n.��kYj������� �������������9'E
V�	'I�����j�I��C�m�L�Z�=�u���g��@w�v�i�g�'(���#�v������L���������7�rH3�3)\���q����f���n6������_PN�o��9&����;�]�F��o�;�\���<xp�����mnyflsKB`��+�������=�Klsk��^fx����\k����LV@��u��
kb�6��6�U���0K�S���)M8������6^���p���u9&mc�0��-���������	��l�{�m|'������m���m��;n�W��.g��\K ����uA����	��������m�C���6����`�k|���u���q���^�b���f����2���x��5��<,����cQd�[������&]_\\3nm������6W&�\_�j;��-���k���\K�s&���K���V~-�-�ZJ���I�b�{�������|�O�%��I�&�W�����TC���5M&�}���.���8���_kO�R
�dL�(�����K�������a7�k70)�Rp��&�BR�9a�D�:����4'���������{�j����a��n�"E���>:��j�9��	���2�����#�XB�v-\�����v�f��\�(r�0������J_������������-���w54}��v�`� 8?O��Q&	�.o8����
�,��J���,R�	�L�\kX^`����7X2����U�H�0���;�Z��0^��%.��0��& ���������.�z\E�2�MAI�_�R�$Y
���X�ZX=���n=����5
�z� @��Wy�K�����E�(R�[7��.����\D�����	Q��r�+8�����<������^L�g�}v[���
�{�n�����^A��%]�v5����}����,����[�xu�&x�����(g�g�}��m��l�=��0Rp��'�n������ �EV��..��B�~>C����-��b�h(�x�c*w�Lr����0��R��E,�@6��S�rfI�\�!��7�-�X�Iz����/"��I2@A�%�x8�����y��v[��=�"�,b�eXuQ(�\RBF�U[{���r����C���P����s�"�
��rnXE%���8X>������c�����^1����,s�1c'�C���W�6���0`IS
'N|?R�D3�R�y�����(�3��o�=~���q��
�MM����",x4��yA�D�VB<��7���C�,7n�V`����7(	E\7|�� �^@I���F�������.�����-�q��^z�-�FR$�v]�!���RH���u�]����z����JC��}�w�@)d;
����I���|�|��rJ��������QR�.�sP��:��������cv���e���2�=u�v������R�H�����g]v���������,iJ!W8f�a��@ �gHnDD9j�4��Xy���R����5�\�6Sp���l��QJ��y��x`��&������sD�?�L*d���$LJ�@7���$�A��PW�e"G�=�����0<�E�����1�n�E5�����0`)J)$h���<A�����AC�t� ���F�EX<�h�����������-����������5?O�a���(E]7|W{�:���7�|ska�0�&�l���C�������v.��g�i&[]C�B���6�l��s��K�W,s���r����&@�/�:�P��n`�6�5��%p��vj���3�����R.�g[��V�<�u��b��(�XW^ye�;x
Jb�HB*��H]�o�sk|��-��5u�kS�R�TR
iw7f�!����!����dy@�r��F�����7��m@�*B)$N���(l��"�Y���j�6`^`��[��mWD&5�Z�
�!�Q�SO������h/���Ja����.���#e�>z3�@ �=A)X*Y
���3f���3=E$�`Q)*�D����=����!�������&��D�@���,E)��@ �H� &��z���z���r����KL,�)t�
.��&(�KkJ��	/EJa����x���.\�������wP> D�H�����x�I�A�`%$q"O(�,������"
���E��<�M�j�`a/���;���#���3�����?S��|j�O���Z�f���Y����/�DP
���B �p����g:�����#X��s����������;�P
�A�21�U�'�[���(�Et!1'��)��hA�7(�E]7�)��u�fA��;��6�&I���7�t��z�d!S������;�p�f�%���b��!^�k(K��;��9��#��s�������"��L�������{���e����S����9���l2!�,���6������_�80�O���k��Vc�5����z�6)\
��y�����o��/���-~Q����(�0mJ�CP�)/3�s�*���Qz����6uR=��B�'D,�(i(�I5
)�vj�������Yt"���@�::J��$J>�G�B���e��r�I���#M+��{-���?7��[���Qr��$b�����H�~s��I�T�[�R�dS
�D�?�g��M�A�0M�	M�	�&m5�VZi%[+��s����AHg����(�+���DH
Y��N9�����n&X��rB�cB�����G�D>,y(z>�`��#��@����"���d��VB��HB>�^�%��9��v�W4O=�T����x���w6f�~������ItLg7��C�������2K������q96��U���(&�>)��4�b`�\��.�3!n,r3��3�z����`�unl��7)
73�j\%�+���lh8����]�@��	&
�(�O>��-:���*q��RX���F&����P
w����������#���O?��^�}����>B�>}�;Y&aX,�L�ve�=(����u��$��n�S����I-��qc^���/6��3K���F
���Z"~Q��r����
��z��z�D�<J�2�P���8��Cl���]���BCD�d�NwfsR���^l����B�uS:��)>��R

'9�B���/��"���y�c���&1EX<�EX$�z�-���xBn>y������o�,?��#o��VDqy���'A���=��:���m��}�ZE�{����:nY�=y-c�s�=���(���w���SO=������kt��^�������y��������O.��N:��F���O?Y�Dz��v���������RHe2mB�!R�((�5D���]T{���6�PXg�C�?��;���*��^^��D0,1����_��4.��J@-�^�31�c}��(���8�@v��E���<0S���6o(;A�w����[�3�<!�:o75�
�yC��"\�y'� '�p�3�aL�
����)A)�fRIJ!��*�k��E����nh�<���L`�����H�Cfu(���% �Q:��R��$8�'>)C���_�
�@ P$��:/�R�(���Xb�%lL����� �Yu�U���-��lE�d,a4��$�ZB��`)<��3�-������g�yl�.�����������������b���^��Zm����X6��Z���n��Y�<�xV��5�\���6��K�m(����{��'�R��{����2-�&cn
��O<����U�mXs.���Xn�.�����+��%� g���E�m�?�|����"cO�20�;�����kl���8������t�������w�1F��$�(��>��`���-9���m�i��s��q������E�m�W���yw���\���?l��m�+�:�� �OX�&hls]��o���+&x�6����u��6��6�����m�U��@�<�O`�s��������'���$�����8F�B`x��������/��k��E�6�j�y��������3% ��b<1���!�;��s}q]!/A�%A�/�Z���z}��|�'�����`�k�O���g��"6�m��TK���w��7�����d�M�,�����t*���!�R�M��MN�B�~�l�,B!��UL�'q�s�����XE����k$����6��M7�T��}

'G�z@�L�+`�Z�Wu���qA)�e�]����*�Ln��:��c,`�U
-HZ!F1���%��k�4Q���	j���[la�D\�V�
��D
=V ����B;I$�m��
�e�)����xB��������O�������?~�wqcUp�4y�b��2o8Et��|��$1�V��`������;v��/o��a��&�y'� ���������<�;�������W_o)��}�Yg5�]w] ^��2�'x�w�aK�M*��������o"kX^���v<H|^3��a�F�E��+2�2�[�p�������b�ZWY�s1�POR�N�a�0`��������G��v��5����"@6�;+/������p'y�E�u�nO7+����3�(�K7���@P<���
�c:o��"&S�C�*b�Q�P1^����Z\���7�U�X'�i�e���
��9��3���~-����������� �N�x���x��P�occ
%c$�e�x{^�$^7��H`P��6%n��(�|7�O�0�V%4J�Q�%�h�)�]����������/����%A)X|,�������@�=@{��W^�*ML��GF�����J!�
]K#��$OJ���P?��-&ne
K��^�K)4O,+��%O����b�A�)3���W�*�6����R�
��V�-��m�G�1
�����"����������H!<9:V	���e)�u��$�7�rW���j	Ja�R�R���C�`�,b���Yv�`����<A��I#yA��'oH���M���V������<���u'O��H�V���$�;o�6}���$���1�N�K�K�RHR���_��2/W
�6��Q��M�����R�k�u�kP��q(���`Iv�'���G����E���+�#S��3Xg�.���s�m������V�_iJ�z5��Z�o
Ja����gt�u�.f��LJ����$�������B���g���M'nc�8b��|W����nk��>�\s�J2F�0'�����F�G�5��|��]�v�
1�����rK�b�B=�7�����]w���Ra1���(����yX*��"VF����	)��*��b����H)\+R�65���Q��W���V�0`�U
a��S����������Y]�0S,"(��l�I3B���5|G�Q��i�|������"�#I�)"�Kq�V\,8E\7(-E]7U�nm����Z�<_g%(�KQJ!Y�n��<�~�[�)/P8���u���d�I3y��,y+X*��d�mc��F�a����5��I[�
���M�J�0`�(����b��9$�����@ �KP
�Rc���������&��M��k
nC�l�+D���hB�aR��ZC&���'�x�hB����'$���t��G������������D,��uI
��}�uA�����,���au��g5`!���v�	�A)X�R
�@ �d�����6q��.���:������OwM	���Z�&t$��eL
��K/mMB�_����E�W�w��u�D�O���-�����N������aM�D�,ywN����b��W�Hh�x�}�p���n�����$#a��q�mk:��-��b��SO����`�n�H������zX'����Gi{�O?��6�������s�E��nh�G�.^��c��d��X�.���^�tM��6�����Mk>����H1]���8����B��7
�(����"=���Et����q
f�w��g�CP
�RX6c����������p������@v���]���Ym��,`U("����"�v���n�F�@M�q_Hf)�MMb�������ey��IY���;��}L\�R��R�b�2�b�e����?���Dy�BjR��I�X�����������Q
�U�iy���:�T�@����k��w/�����K/���\��.�����z�������v@AY��H	)��kP.���'��=��i��������7�_��}�j���g���BP
�����e-�@~$)�L���Q�������aqK�-,W
[+^�D	� u��C)�v!�q%c��z��m!ij���x����^���G�]����m�;���i�'�k!VH�>���Zk�����o�<`^�bz������w]��g�9�K�Nk��s1�F�E�k;�k^�dj��n%�'�}��G��Z=A)X�R
?����O����M���������&o����BMn���BM�L��hBy��K,7�"����]X���,�p0��"���^,��dQ����4�L��{��1����sOkms�h�:k!�\�J!����<��=J�k�3�<����B�;)L
X��b���XQ
�:��#
.����������g�^�"�b�u*J&ni,���}(���2��a�H���<�F�����O��3�k������p���"����ks| �Y���5��:��(p(t�~���5X�P��N�G�~�]�I���;V1�m$�t���(T|.�K�B��#=��7<��3[E���w��*���1�B9�����5����U���)�m�]��O<�*uN��x����^�*�t7���|��n�����m��l��"�,f�
�J���0���a��%��?�����Z,l���u�Y��E["�!	��1��5�i��)�[�Rc��)�]�g�0{�;0�Y{y��<�\�E|���X�$g�����7�+L��,�w�������3�����(AP
k��g�miQ����MR�F��
R������^{m�nh���r�-=����������QJ!�{f�yBW�":p��<�
7Q�@y�%��DZ�h���!��ho��f�k4/p��.�� 5o75��HA���|G�%�Z�R�(n(ziJ!E�i�M��d\a
��}���7�QK/�'�!��X���+���9��c��1�+�����@ �O�R���B�J+�d;;�b?���6�%��U���Rx�y�Y�x
������R��o_/����l]!�R&��^\�����7�j�m<N#F��J����]k��V<f���Mp�f`�t�2px��~���s�6���6�S�U������uY��X\d�����k����m��Fw��l��
>�8^�6�'�d<�q�v|�Z�����m�o�����m��[��^�>��u5g�F���K\���F�nK�n�n����%m�z���cMF'�`d[��8ls��m�s��������]`���m���w����;�_'��]B5�Z���o��n���#���:$����j��x��[o=��p��c�8=�|~G9��7H�b5���=c��%p]�Q����IV61�y�i�Bj��)�������'�4�Lc��<"���"��u����(.K�������kn��WnT8�B\#��j����"�r��Y ����O�E��b���P���$yp�����G<zZ�*���s�(x�~��g�$)�L��-�P��i����:>�Z��|

����$���y$����w��m��]������oB����t8��'y���B\�(u��{���N�u����.; ���a��D[_e�U������������a]]�.�Q]3bD��$�%�
����1`@� aQI"( 9	,������V����=3�=]����S}kn�����W�N��������U�5Xu�E��U��I(+��^N��NA�7o�_RaM��
h�--6��2�(6qq����@lA;[P���&.�
��;������pb������n�%M}�Qm	d*��#�_�����K����!��������If��l):�{����_�x��{���Pl���<.Ua0x>!�&x5���c�~����,�d�e��U��}���U�|_�GD&},!l�~�=;T������~}Iu�[��/~�8����l�
Q�����u���Pym�����)�0�4��mAG=`����L-��t.�9�:b{++,-��M:M:h�0�o{�,7,j�
b���m��i�*
`�c��)dDS�(������T�s���X����>$�S��"F 3U�>4We�����@!�\�n�19�4jg�����������@�P��!L�B���X�bz�q�oM��#����0��E
�+�������q�����c��h*������I�(�m���
^38�x<���1�2���X�B?Q��pk"���]^���j�
^�w'|e�V���q�Z"P�uH��"�Y43���c=����$4(`�c���5k�krl~"��>�����r�u�*;w��%
y�q�e�_SA��S^z4�D!S��}k!\hx��L6;��c�K!V�x:��B����sMt�y��J����KL8
h|�Xa,�.�S7����iU��3��=~�x=��x���d"�`�h�p�fm
ID!�m)��D�`zX I`l�L��,,��Zi�Y(�E��M�]���_|��>�]�vz�L(��p
���P��4h�@["�����,dB �c�t1��$���o���mv������#(3�^z4�R��x<������q��M�D!l��C�<�G��q�����!�,.�������(������L����7O���B�&���9���Fp������l@��p
�&L�pd��~����#{0��b���#M�&3�5����p��M~��W�p�6�g�^�����ZL��m�5\Cb\t\\�����x>.��E�G�Jz<�'q�`�^e|q�[��
�dE�m�5���6�.�^��b�6zQ���z�8/Q�����UYh���	�B����Sl���B���Xh���h6�z���mp���	�.]����K�;�B��k�)dO��E�G�JQ��[�#����V����6
l[
YI��
X�L�C�(|����������E�G�E����x<�7^z4�D���c;&��>&��m\M��j��v�B���[������l�e?}�8~���NxQ���Z��m=�p�1=���������1��mO����a�;�C2}<l�����5��m�E�GcG>/�6�E�!iK����B���.]B����C�$�I��]�(�h\�B�����F:���������:�e��\]�E�I�����`%r�������&��w��-�lC�������ET
/
SLVV��j�J�����J�S���������x��r��G��7��+���P���������+�e��05j���z�����O&�h���*QX-8�/<�']���.�)��S��5(���0�t��E7n,o���s�1z���U�V-�X�����OANdS�N8aO`�x@�c�|����������'�c6/_�|�M��~���e���AN��������7G�G��GT,4q�t���������m�y�'����p��� ���Pw	{a����n�%K�h�2�dff:Y�E���p
W�B��6^��P�Qu�Q1E!D�!"��������s���t*��x���k�!����/����_7�!X��-+�>�h����=Q���z<�'�O\,4!r��6!3]�����w��z��&�5>�^Z� QB��r��I�2ed��I:o���R�dI���W��R�O;�4-
��y����k�\�����C�	���4i����f�B��E���3����<�*��"��t�����R�%��o�W_}e}%�J.|�X��b%��lo�����o�B���kxK�']��e�vm���1D!SA��s�\y��Z��m�6����������+W����g�}��%�L%��Y3�C0�KL)����AN�4o�\[���U���Y%�������[�*��#���G#Io��zY	�����<���g
�Qa�q!��xV��qo����3���q~y�����H������Cb�q~2y<��>�g����Q��3;��<3/�y���e?V)L^��`���:/V�;$V^Xx.a~���Xya��k^y�5�;/�����3��?��T����:��c���C���_B+����f}���s��#N%!VLb��Q�B��7�U��=���r+/�#�FsG���Xu$�<������D��CX�!�<�Uy3C�K����K6������_�z���=[:� �7�W���<�?By����S'm���ZF��C��p�
 4�UX��M��Z������7����/;v�S�0�3�|���jkB�T�R��e�\ ^lMCV�K��
=����b�����E8 �Lg�a2�n"�L�6���)����c`�C������p���N�c�.�� �S:�t��APr���'�h��L���������i�&';��$c�M�>}�Xo��E�m��2���j�e����5���m��h{7 ���E�
�#������'H+Q�p���W���(QB�8��=	��_�y�-�<��)
��Q#����i{<�'�N��B�`��#��eV�6,�d�6L����]v��MZ��'�xB�8�������&�f�F��!IT&���"�:��C�����
�vE!����0�.�eZw�����i����S����l�������j������M�3�(���*6����JM�R�bt��-��t!-D!B� �����,�}�.D���7��q�	
�9o�Z�re�u���%��m�%@�������������
p�l�W�������!���1(09��U�X�l3s�L'�XVmG���5|����i�S�����A�������k�N.��=����i_��(d�x���#����$���`
�i{0����`
i�)fXl�0v����(����
�v������q��[��`_��(������K�������r�">����^����,fq���E.f�C$��M��6w'��;�k�����k�q���~��7�#{p
?}�I�N���+r�����%\�O���9�����[o�~�1p!
��^Q���@�����Q������O�/4a����6`���i]��d��_y���&�Da�QL��B*Sa���3.D��]������I��;P��@�6��<l����X
���6�g��B:�(�����&�{L|R�����H�Kf����b!�;��k����x\�v���[n�+��5k&=�����~��vEa�_��Y��4� �a���'�.�[����z���[B���Y�.���_��Q�5�5��t�7i�s�&����!
���H���9��i.\H�N�w�y:��'1l[
���V�g��"i��������x<��@|��]D|�L1S�b0�,����b�L1�va�H;Q�/D�
��c�����K�J'�x��)�}��p��w��-
o��Z�
0 8��
����d����#{�0�b�I��}�Xp�g{�	n
a��0MM�_�,^�X���+�����1p�@����x \��Nz��z�1N����Dg��?��mQ�{�J5���F�mU����+�
d*���x�"�Xh��l�j��]L�N�4�����y�t���i\l�v�'��
��u��`� ����B�?�b�	,]�\�N]�D�J^���y �������b?Z���|,Q.��H���c�
�	[u��Fa��P[6�����_�|l�����6\�<�B��B���q��r���K�F�r����K�O������+J����;�#s���_�[���Gx���^U��b�:������m�tl�@�r�6\�E��
W������R��6p],�`��}�	�Q�w������!M�i����[/��$&�mX�����G����&�a���9-D�UW]%��U�{�mD	���s���7*~,�^{���fc��2 ^/������uX��o��|�:���^������� >��cr�e�I���u��D|[��-�+e��a���G���<-�_� ���������CVVo���[�V����!��g��?����_������}V38�	}�� )�(|���]!�B����@@q�r�����%lmw��gK�%����r���SO=%3f���:^|�E����[�h2�q�oe����C��cF~��!���?���y��2p�@}L�����}���>��>'f�u�A�r�u��QX�D����[�������%�\VF~��Y�z;��T�4`������7U�� 8*�5Yh�����4jo��Vpd���Zpd3.�����Xhb{U�\-4q1�B|�l����w�	��A'���p�tYh�b0���t�mi#
c��
��\T�pJ���Gu�v
5	*��04��bf�:K�V�x��t&}VS�c�X����k'��>w��-�)����`����3d�GO9nwV�[��T�����x<�d ����&�8�^h���Y��]��b������#;`�%r�m�Z�-y�D!��{�����0a��d'o"VGv38����g���wq-��d�B���
k���<to�2��l�P�
���������w�0,E�5�*(����m�&�p+�Z�bz�2�l�x�Pg���x�����b��Y�2m����*����c��7*�{R��~�!8����k��MRYn�0y�)�5�
����7���P��/�/
���������|0��(���z�������?_����\Slt�L;����+��;��#��6�D��{����O�V���*�'_��Y7�Nz����^��4��|o:B�����g.�`DK�����{q���+"�;v�U�)2w��Ng���={v��L��gv��?�A'��;���y2MJ�9"���^��/yL�������d��t@��J4�<����T8�����(7�>�����f^���AH�g�?>?V�&�k�?��<���L�;���l
I>�<s!	���P���~fp/y&�v�g�8�oK�����$�t��np
s����g��O��`���gZ�)����|���Y�)��g�qy�"+��.��5�z���<������#~'yf}�~�]���%|?��{j�]�W��<�+y���?3��o]��~��&��+�c�M���a]2�M����R~������W�����O�.��/7�<�
���.a�#���y�y�3r����<�.����#|�����ZM�H_E^<����A&\��H_E^<��k0(la���W�ya]��~�=p�X�[�/��C��b��B\�6m�z�������C~��gru`���F�<x��=z�����VN|�N8�}�y�|�R�JI�-�|�$����~Z
��������O>Y:u��*

A������p�!i��k�j�����e�(���Ma�����uvpTtFP�}�����`��+z���b��W����-S�������C��b�!lO��+�������m�����H����K/�����k��JX!������o����A�#���sD�'�6*>�fT�H�s���g�9�H�Q�s������9"M3N�������������S|�����(o�o����Y�Qb��{�j�	��6<S��m(G�e�X�]4�,��������( ��D����6�/4I?(�p��N����.�U�L����>:�y�!:x��W[kq�7��}����#�Xy<��WM0T��H1��NM0��Xh���6i'
�F������8�����������<F�6���U�E�������ER8�	�K
&r���?V
��u����3��<���(���g8��	>K.�@L_"[��c��K��d���xl����c�������2m�f�34OQ#�Da)���Dv8-�+olr��@.x���y�^�:U��'
�O��;M��G�c:����)5e�`�p
�`)�����6���T8m�m�%�%�tJ���������U�q1m��6��A���������LX���/�
c���4n$�I;Q���%K�5�\#7�p�\���R�J��Z���D[
a���z���q�|�T�IDAT9���	��@%O�������N ����@D�+�m�a�\=l����������`*�D!+r]8~��D!,|�����88*<�.4�MD����������!
olC�IlI�`� h�mp��M��B�)S���i��JH�b��\h�b��H��&O��BFlmG�]��^�l&V&��3��Kn�5Q[�N��5�{Ba8X	����z<������&��r�	�j��/4I��?\�|�I��+At����&&/Q�h���
�D61Ba8T5����c	C��l�wS9.6�������rm��W��L���b���������i0��Y�,��}�=�B��B��=�����2}@e�N�?��(\?�m-\7��q����s��p����_GC�umwl@�b:�6�B�����(D������0�E����X	\���7���)\���/�����\�E��`��{m'��&���"�`�f�b�����P��7i'
�V���T�\Y.��2���Ks����;i��5������9q����P�P��8�:Zz<O:�Ea�xQ�8i!
�����+
��OG����~�,�-4�E(����c��0�/4I��$q��/4I����)���(��q(�*U������X�Jg�uV�
�=
���W[��g�G���"�a��o�0<F��<�'��M�/4I���|�G�t�@Q����t��]j����� �
N(�U��H�z��M�-D�%���">�.��8�����E^��lO�����i0j;��H�v�_��.�JcawQo�������B��q�:ut���z(���Y3����"�X
a�O_���Z��?9��h'����0�D��B�����]�?�>f�6�����N�iP+�Y��m��m�E@!m�o���m���S,2r1�G����n�pa�������N����r�:T;����������s�='/����K
�S8���
<X���W/����F���
��<����
��(���t�9��	�������i+k�d���(���������������K��]���(���(�&|F��\r�^=n�8m���B��#������;����z���G�8_|��R�f�=������)��"]�v��1����Jn�����&"
�N!�_l.g'n�J�B�?�s��j��!�����/
��zM�6���{��M��h�8~���St��E�����e�����S��2y��O?]>��}e5j�k�����NJ�.�#�W\!w�yg�}����{��&J"�����~����h�s�hm%�!�o�p^�W�w���p�K���}���M��B��>�,8��_h����~�m�I��e�R�JR�\99�����&L� %J��_|Q���v�i�I��=��[�n�_#P���:$����Ii��}pT0��������y���1���6�YY�d��;����W��M�d��z��p��Jq�bj���.0��oq��5��e�%��m4�.:j�.����6��Ql�u\���.��co�+W�L����o�������E��T�R��s��������{�=_��}u�5��,!����B��
�!0�����k�,�l��(?����������+���\���)8���"��R������~����S�
���{�3�,�/_�} ����b:��y�`����w�g�T��L���_��Q��
��uG)�=D���J��P�T�*"<�����B!o���AN��w���4���$o��1���"o���A���z��"��5jT���F���'�2_}����	��������������\��������6}o�T�g:��������3�3w,����=������g:�s/y&��a@��>N�s��Y?�^%���������)�g
,�u)��`�rK���������	!�5L's��r��s��g"�;yf���~�u��q
���_|�L�c~'�������kP���s9���s�y��x^��ux�|�V��g��?�rB^t�"/,�@���QC
�_�e>��)���/��D�Wx
����9�	����+V��� /�_�>���g�{�3�����+�v�<!�������/���Vf����,��
��<s��v�<3�.C�K���~�f����yf����~������f{N����~�<�w��<3Xt�W����������x�*�/�5�};yf�
�P��6*��E�%$V]����U<{6�������z�^Ps���9�w����F8Fb�*;
�v*L�������k��6�p�N��P����S2������Bv�jU��U�r�~������)� nL1hD�����-!��H�>���A��, M#�-���^5������Z�j��K/�����QF]�t��R����GV�h3<����SOi!X�n]=r5G����(`%��'�F��t^0jd$dF��t|~dg��*�5�}�u���/�D��$9��1-�H��&X���������#;`!q���]zl�5�BO��E�G��(��������_��9c�D����V	�JJx�SL����.�W:��_���B>�^h��X:-4���B��m�E�G���X�<������;�k��DX�����X�7^��a������:`������T�0�
m�<Y�v"�����4���e��LI[0=i{A���%���z�."���H�G:�6/
=��EaP	��.����_�
���t��sq2dg�W�����Wan��x]�Bih���m�T��o�Mx[0�k[��e.������-p�7��is��-���7\��`�E�d0e.�mt�2el�"[0��m�`0���#�(�h
c)���?��=��m+�n��7������U�\8���x<�������h�-pX�a�(�>��
b[ �m�#p�������)�(|gu���
�b���������YU�
JF,)�6m���zK��	V����zZ-44h���&�!1CJ��:aQl�E�p$6a�I5�&XVX<e���x�/
=�T��U�>�9�6��?
����}/���*qXVu91�<E�t�[r��EVl�Xh��s�M��h������8��`���������?8���M*D!,����|�Fp�g�������v�-��k����W���"�zJ�k3�-��1��	�QX�l[>��1���iCZ�_f�b`Usa)�,�S��}��xQ���Db���dZ����h�#��{��p1:D�2���Cu��U����/
��)w ���U�8��8D�w�a5�Il�N
��b�����.������
�p���v��453���l@�w�����-�l;H�0vxQ����R�D���m����N�8���������.���g�M��u�j���\LS��^���Xw����B�&e�0��o�2����k���SA�-m�4���U%����b������E �awM��qs�`��^�6`E����-X�����%��O?
����4��l���E��~��']����I���~�����-��g���������IO�}�]'MX}�t����l����������D9���/�#;����G�����I�(���^�im���C�U�=������X���[�����������Gi���
�\�>��,��Xp]���~���J��$e�U��x�/
=�T[
a���2�[]��-G.Z����#*Sm8J2$8J����TgRJu�������W��Xh��)��2u<{����s����5l��	�^�	6@�\�� ���bA�����^��\,h���m����0vxQ���"\09�p��nT(d�Jg�9�:Q�au%�I��$k��!
���3:V�<���<Z5������I_~�����lwi{{�	&�~�63g�t�j��v������m�E�cp?�����;�������/��o�y�S<�%J��-Zh�H��R�JI��
�B��v�i:�
L%;��3�n��IUh[�p��w�.'�wDF�T�#F���@L��64������)�w�����@l��,.���b�	�O�+��.:B��Y���&���*��}�G�`����p
�B��^:���������������S7"��{���pg�u�|��G������Z�ji�J�r��%����(V"�x����S���+!�,�zh#t��Sta��_L����3}����,4q�_0V���Gv�@�����^:���h;����k��z���)��"W\q�����P�dI���W���.a������T�tiy��'��F��P�lY�����.%��M�6�Q�Y>�5������5�����l���]����u��I�����B����Pl[p��r�;�pa�uQwx�.�������x\A�l/
����SO��
f��h���W^yE�-[&����V�Z��#���O?]�B��/��i��Q�-���Fv��)����;��u����T�X1�Y���L��!L�����b� �<s�0yLk��!����g�{�|.'N�y�����]��#�G�(y�<��<3�'��>}���jc���Q����y�N!�o���G�"&�7�C1�!�����\:9���7:q����#���t�L���w���>�����0�][L���*?9;u��7@O��e�%�����uL��� ���O��|_������-�`4h��>7;E���_��#h�B+�|_�kZ�Y|�31��9�g�s
��@��&���,�;�P�8������f����'�1(���g�9d�y���P7���Y9�(�N����5$����a8�;y����]���������~��<~O��<�Z�/s����2�;�/�5m��3]@�Dxn���+y�����y�<����~�B(O�G�
��Q����O�it��<S����������jZ^8��F�6����5`��~�pU���~�������w���"�v�<S,����.`���]�h��3��5���?�������Q�<�~�^f�h_Chw)3����<��H���������<����������+���<�{e�<���y3f�l���<#!��>��oX�yVB��'�</�_��H���;vl��_��_bqy���P��3�*f��<��t��+���Ka{����_,B�D�U�!s�"e��_<{��?����������
�(�������U���0�w���C!���B�
�
_�z��;�v��i��Y�x����ru�bs��@��|po������'v�������03���
�F�{���{�J�1������t�����\�����@,��
[ �m��2��=E
~�x/��ca	������b.���QFA5j��C9D��#�F=,,����%�d��(����^���������g�V;
�����+�m���-���(�7�T��]��?t*F�.�D`yq�����B�%���Xm��,��aE��f,vXSl�5�BO��E�Gc[.����~�.�c��2�kY;q����r�W�23����_U*�-��w/Wb1S��)&��S4a���B,����
��2�m�����d����\ �\l��E�Gc[���{�B^1�mqz�����R%��S�V��\������df%�VE����D���*�	0�o	-�6��>8��B��J��pu�u'��
��-��(�h\Lqgv�M�^9�}��u�}q�Ov��D`:�\0b:�h�w�<=}���u������Y�����K��o�8��6��Eh
��Mc[`!���N���[ n]���.;���`��b�:\���
����p�l������E�G�J�b�������<�����S���
�,�h�
�~���������6gL����m�/��b��TNf�e��2�\El��]lw�E�G�J�8&�}�W��P~nQ)8JV������p�g��B#�D��I���O�#w�����A�������t��s3���%3d�-�F�!dlAx3��
Xh�U�6XU�b���R�����B�foZ
�Y���2�����'���L��N���E��E�� ��v��������	+���l�`33n�-�r`����b�V/
=��0^���?��M.��	a�.|o �����/6l(oX���X^
�}\��t���=.H�������B���($`i2A>�3H��Ap�?D�7wH���.��>f����E�BbH�3}�q��{�����h?���;�>6a�����-�a��c��m/�b���ZL��7\�O{�/
=��4}��9������gd��JZ(F�����OQ?L��}�=}�p��u� ���U�-�>6�Q�M\������i�Da�"l)�f���JV�r�-������d)�,���M���#U����d���=�m��Rh�_l�t�b����z�	/
�c��U2T���DGF�����I����dl��J��I����_�O
�G9�����Nz�3:�HL�M�n�#w�H��Z~��g�N�����AH�I���������B��?>8�Q�3�&X
]D���p?fo�B[��0T&�{�~�~���uY����
B1b
�����ap����o!��$�� ��ul]����:.���kxQ��7Da~q
��Y��r2�ZI��x�:k�L��t��������)�}%Na<�g��S��{�r�y��{����G���Nf���u�S��=�-N��Wq
��~���.xQ�C��-�(���M�S�6�V�n���)�=����B���q2�l�����=�?j����X��_m��h�u���a"o�����lu�b�	�f���M�+w����~r�=$M��������"��O>)t������L�8�W�^]?��;Lz����S&T�Z�j������n����M�M����[m�*Y���7O�<�A��]��s��?D~�SJ����~����d����u�������rHN(�,\�R���k�c����[�4���P�/{RD(>�#��������%s����(,�.�
���BO��E�^�A�R�J-�^y��=��I�&R�b�\����o/W_}�������m��R�\9>�k��R�re��V"�
I�F�BR*A��X��{�#�k{��u�k�f���x�������:l}��[���:.�9�����Nd�>��o��["G�?j���^�R�N��uk�@P�U'^o��h�
���G�����O_o�p$.�0(���1��D��6m�L�818��Pm�E��r�"���{	��o�Y[�X�T�T)y�����������V�Z�Qu�������O���{N��{���J�F�T��cu��X�lYy��Gt�/y�)l�e�ah���d��p�z�8{���0c������{�2��F%|-w��m��X\�����j9�O���?���`
�kk0����u��`�����`����V@��a��IR�D�=��cD���?���?"��2e�H�.]r���Q�U1��$��E��,�/j�:Y���/_�upd���Wk����5q��U���m�/�}����2�_#l��M/��

�[o������V�#+8��m�`����p�&X<Xb�Y������U�2UyX��_�������A��"�6S��>������[w
�bb���O�>��=���>��[�n����G����\y����|��z����+��9��|���&����d�����DF�)g�}�^�	�^{���Q#������w�����AN|4o�\[�l���?c��IQ=��V�7d�(����������g���pF���:/V�)ZKy�����<3Mb/$��U��<�z����a�O2��S�dpV�X��������`o���o�<~_������BR�gv���T�Qn�3-���e?�zSP������<�j�����������);{��{O����<D����,���SWk����A�"VI4�g��Ou�����L���<�k�7V�/�<"�Z�J�����HAy��2x7	�KU]�00e���(BXG��Kf^�:�(4]�b�+/�~�/q��/�������#����Y����y���������������(�����P��
��_��_|qO��O�d����qc��8>����pX�p<�rx�=��>���SN�-Z�*T��6w�{�aN��,H��1Qh8mo9[������{��i�/�]��x���_!���*��-�,�
�-�KX���������U��_�>�vU����\'��w��o
��������l�����h�+�������`��k����������t���U�4j< :6�<�]C!�9D:�,�����x�>��;(q���o��5Qh��G�6�>3��&k�-�\s�����eb��J��u����j�F2e'Qx6]�0��YSw��6���
D�N����*-	�
�}������SB�rpn4.O<{�\� ���,�(�H��C���s�#{�=$�V�Y�d���!���{^z4��B�;T5�6B��+�~_(�N�����L��{�����a���i��_���%��J<F��
�?��A|�L!�m���L1hs1���a{���w1����IDM��)e��I�_���.SA>�'����e�d��i2�?��7�S&��W��?Cc�v���,fq������m����M��e�m�*�t�m��h7�`��?�����7U��u��c���A:\6l�^��xDt��6\�O{�/
=�DEaHG%
k���a��Y���C��eB��H�X�dw�Q2��W����y<E�l%�f�4;Hs�x������G�(Y�����h��5"���U�����e����}b��3&8�s����jc/
=�dE!0�\3Na�^��w��
��v�`��E6���K�:7C�����S��"�k�(�]�!?�}VpVj�g����b��w�}�`�+�� ����	�d�}����]Sv�����ABHNV��U��Su������M=X�����.xQ��F���l-���y��G�V}�������*n�w��2dB����[Jj_Dv_��d��z���NT��=i�-���MX}��ez�������`��]�q�l����=|a�����]�~Vi��~R��XU�O0R�.���� �m������fV��v��Y�f��?���p1��j���;m�E�GSXQ[T���� =��c���U|l�7E6��iOb54�r&�,�RIW=9����3���i�}'�T�H'g�Q�b�	���`�����9.S;\�\��xQ���B�f%����L�?�����L�A>mB��b�:�D�����9�e��I:mQ���3�d\��u�0��v�z�������Bv��m� ~���&X;l��
,4���&��p���F���;Ti���?���S��G%��P�v)#����;xQ���J�tS���1��O�xl�q���f��
�����V�`�<�X�;�[�=�hv�����=	�#+���?E�S����H.,E�'��M�eKV�0M�o�m�����&������B�&���V���(a�%���<V(
S.|<�W�m]�$;m��1���d�/ctBH���2�T�4�bz�-����k��_�0Q6;) ���6!T�m�xN�:58���S�M���X�t��z�5
3��s�x��SiL�&�x���^���N����3���c/
=��+axm !e{�3�;\LO��0zs��m�-6�7��(��|#�x����l�9A~i}��g��w����/Y�n�0����.�����(�3�_k�F��-���lCYvQo����&#FtUq�jw�U�p�J�x�B���*��	�����f��5�=^�xqpdvMq����&L�����vd�����%
�����'�p,����)1:R���I#d��#uZ?}��h����-����.��EV����������xV�cy��c�J#u���{���%�R��N����.���5XLap������fl���qa���p?��>"����� R���r�1�������L��[���X�F2���Uq�t�+M���]@l���k����|;��� ~KQlq�q�$Y3�KY;q�d�<Zfv�M~��l��Ay�~���^8�t}��H
M\����1m����O>	���7����N�w����=�F<bm���ci���K����s�h��US�����S��x<I����;L��C�<�H���E�~��F�x4��Y����]�R��!�����������(4����`�����n�7��"k�~&k~����_��nw����dJ����#�gbG���m��y���2o����B�G���`Y�4�"�b�;>Si�J_���u�`,���:�`����1�E��e�Z����*w;�g���z+J��ZhB�[dg�W��+�������P����-�_S���(�������/�\�T�":t�9sr������cG}�5�\�w�����a���\*U�$W]u�^5���u��"���n�����h-������(l�Oy�B������6����ra�,�wM�\48<o��b����6yu\�M����n�p\��G����:��0L�<�@~���Lm|�N?�;S���;����'S�^"���	7�_���g�_Sm��(�t�6����c��.�
�p?�h�uFn���Ju���y�=J4VTu�� ].�k�vm�j�#i��u���D����g%�����t�l�r��k��233d��%
3$++C����Ea�7�R�J�)��F�{��{�L�����y��{���8�N�:r��7���S�s�=��	GL���__�����#��{@d�_E^-R���r��e�q"C��Z�98	
����s��2$*}��/��]�:�J<>��"��J����W�^ �mC�c{5@�������s�)�vp\�+a���6��YV~�O�N.S��@F��!?�s����|����2�A�K��-*i���J`&D�
AD�X<E[c{7�=e�6���m�F8P�	B��c�/k����-
s�B^��r���S�v�$�g?%��=�������"�����>�)S��7��q��E �������yQ�uk��/

������w���r�q�i��y��'��5Sd]p�h��g�!����>���.�{��7W����l����[� '���9}[f����b^�v���v/�Ft�:�=%�RB����D5�}����S4�����������2���eV�:��U��r����������*%$�K�G"{����[\�g�����j����I!���;E��K>a�l�Ov������a���y�����.
��z�+m/���SD�pa�X�f��)tM��R�j��)'�x�����%J��/���C�5���N��={�i���K�I�3
Q�tt��&M����X�X"2��X��0A�f���y�����yx��~��YYZ(^�*����q��'&b�-[����r���R[5l�U�#�OPI�|����<�W_}58��.������SHnE`Ae�d����,E��7����������9.��7�-�]���#kG�++F�+��z_2G(k��������?���>&S�]&3�V�t�?���z����:M7��6�T��)�����n��F��r��?�6X$],��.�Z�)L}Hu^Q&����{���X$rn2��/^<8J=^����5j�9��������#q���Q�F���c.S���P#W\q��q���paM,W�����+����T�\Y����W�"�V���t���Q�������h�im���;�3GO���g�ZV����A�Y�K�9�DG����FJJ��x�����J��[T�����Ju]3J?����W+����3�g��D����hQ��<�B.��B`z�<s�5*\�<�iq�C(�EF���1�&a�Y���Xy|s�w#��O�wE����f������;���!<��)B�<sOT����z����y�t1e��R�B(g�-3��H����{�3�+��<�{V�aZ��/�gZ(����u���g���s�3����{�^�g���.�0�����2g�Y�.�wlU���k#�v�_-���e�}�[�rXY=��::�T�s�~L>��8Z�,�����t%,l^Y>�z�|Y�L$���kz�nt��4�����W�����'��l�,2�M����x�W����~��K���Xu)�c�6�A�WP��]K�~�u�>	�_���~�19i�)o���Y���e�%V��1"����_����_|���j�W��+���<�
)����@��� V_E�D�9H���g�q�����(W��;��}������������.�=����m���?/�z���
�(��������������J��o_)Y��t��]
����������O���;���C�P���=4�Y���0h���6xN}U��ms��|�B��q��N������j�����u*=��,����
�f��2B#m�����-�Y2;[`97�X�
�-�Lh�mCG���Svo�&���,�{O�}������eP��e�0%6���O��i]'����v7���D���������At��c�0:z��7�2�?0LL#���'�FSLb�cj�J����o�����Qj"$�:��.{^�����w���i��o��0kah&%��6F��Y�����~Y��WTb;>��A�?����`�<�i�����r�������i��B��b���i�����hSl/j��aZ.e�'=�e���T���k��_;V����o������'�&�m��)�zg�e�p\,6�����d�i;:m��-b��B�&�M��p��w�zhaQJ*H��0����A��GD��*���(���;���=�bs�����z�`��v�U���n�_;G^�iF���'���9�d��mK>�s����������c{0����B�������p^ �
���3f��_~�_��������_-=UC����$B��P
j� �T�Aa��8:p�w��������l[	�����Wl^�R	V��>�(8�V(WM\,�z����W,�>_��3*=�oZ�q�����x-��%FR���Sfu�-������=xQ��g��������ZED��Y"���_&�!�5U"�f%�MZX�N<�/�5
�c[�j�&���]	�;T��R}���5
�F�����I#��l�;�c��#r��!��o�?!K�=�oZ:���>���Zt��f<r���52h���<j1PL���Oz��,�~hp��d�g���o�
������A��#;�V�-Bm�E�~����]��D��U�J�[Z�cD�$�9Ia�w����������*���\�����k#)�)���������Rc�,Fd���a898)��E��T�Y	��J,�xk���Jw)yw!Y�x�+"m�
J���R���>��V�[��6�m���|���1U�Y��XV\�Qd�$(�V~�O���U�~������J]����9��d��w�����7����� �������������4�VIY��+�_={�][7�|�\�w�%�<@~�Q_��!�k����3R�1�� r����m*�n-E.������^�UW�L%�j���W�&�(qp�����k�rN6��O	��E~8Bd~S�)�7��+������"�S������7�S�����!��B:�6���d�~������s�{�F}����p�0�`��Wl�U��
g`�n����Q��'.
.VR#�lD:��\ h���~O��&���f�`Y�F;Y��)�t�2�K-�����zg����eZ�k��$^f*�����O�������]b����Y���,�y��y���y�BR�;���������&�_��������n��2����u����3H��z�)Q����m�J����"�*Gzd���DFnY��}���5��?>Wb��������_���`���V~�uY�=�FD$�>�m �$��_�������i��Az^up��gTrS���X�P@b���JH������Jl�����$P3�-����k'����-?7�H~{��l����R�����%?s�FY�<�Pp�������b��Y[eT���bk)~�_��pt�P���������/���d�*�M��<�_�{�f�v���<-"��_o��i�,yX�=�� k��<����gNm�A>��Y�>r��'�`��HH���6H�Ge,@x!�`� l�m���%"��������?���Jt>�e~������$DPA��t���f� ���W2M��;v�������Q�_�P:�������R�G��$q\-4�.�.�u�:B}���jgU�:��-�n��*o������a��N�����m�m����o����JO���J�(,�|��#�f]dK�&M�H�����kG��?ro�ov����7$"��Rdh�"q_#{��=�<���D���-Tg�Xp�I��$z�(Q�Du:-UzH������5���45������cJ_'�����=����}���Fpd�jP��Bx��T����k�:g�$�::gJ[��7OF}�Ypd���W�}�G�E�+�l23%c�|�(^\2T[��a��T�Ea%;�nG�)�W4����'��w�t���L�q�������YO�������!�|x��f�^��~�jY�[�m�R-$����f�l�!�����?�A��q~�������&[~�?9xm����x0
fn�e���-�"����sU����-H�n�(��o�e�L6�P�s��~��f���c����\	�z+VH�_~����uNaS3uo��t���V��j��8Q_3��S���J�M6o���T�V;wZ�-azP�s�B��%�������(1R�B�!Q�ZZ*�f�~q�b�-U�g��*��V��L%������p�&�3�����L2�g�s/���J��-��E�H�E��<(�2G���M�L^.Rw�H%�|+Ri�89�__���H���~$�s��Ef�EC������1/O����t���q0_�E�YK;����J�x�sO���yD�uj�g����6��}�H��J����l��S�.D��i���B6vfh����5�U#�
*���5�-���'�Q.���������mp��L��k�X}�������So6u�������v,��|
�p���m�-c�*�-;�������@�z��e��ewj�P/
�0L�9�N���uj�I���&wO�y6L�����c����RN�\��9�J�������I�����E���v�����:������0<�R�l�xT���8���~����s����#==Zd���
��lUB!�������Ub�U�)R��J�h��Wp��6�^"���H������[=�b"��TI��6��%��,�'�.����n����'*�x����U�3�{}�)���i���*�w��Z�7t[�WY�n�d>��d�f��"����J��"?4����U;���������d�lF���������T���>�o�V���m�:���/{d���5-��P���a���shr����k��A���x�x�.2;�����"Jh%���$��������~�k�����[JF�������Z�u�t]UOZ,�KZ�;=���4��S�}*r��y�z*��X�#U����I�����Ya�=�IL�W�t�d<�Q2^����E2z<)��)=D�5��:��������:����On�����R��a�������gA4X�W� �n+�>��>��tR�}������[*��N��$��'"�c��D�A:��+�>����I9���O�}����j~#���4�{�N���������1�IER����������,\�����2�o;$c�*cOl����M2^yt����������}�Bos���2CU�����>2�w��b��EN���\��Rr)`���?#S���������r�OD����/��{��"^�[V4iq������������9���������c���'���E�� 2��(���i2>��<�B1��A"v�$��8R6.O������u����;��[��|}�H���+������6���\��{��
,�CF���e���2�j��?�w��E:\�����^��D���Dj_��<|EpN����",�>|����S^x�������%
�-�k7n��V�����E2��y�c�Y�{n�+��	y���
vf����j������^��K���Z����VM.|
	��"��%K��F�Y��|��+�z����i}d�O��'_��_�vfY�|ypd�����8�{R~~�i�p
��'^Y�]&o�!8�~E�������B���6�Z/-��_��cu�m��,�=x��(�����SN�.]�����+���n�)�	��y�i�������K����ky�������**VC��K"W]�a��Y2�C����Q�b�:�q��G�`{3W��]�hB=�����e�di� ���B���
�
tBh�l/fD!+�m�5\�BL�t�o������?��5l�P�mou	l����6a0�b`��>���Q�R%���;s�q���������s��5m�FZ?�Hp���!w������?���E^]���s��o���%r��"�5��W��|�8?^�L����UG��l+��.�����=���S�gt^�u�Y����G���V��������j���e��6a�����%8�VO��k�XM_�xq'���O?]����������3��h����'m��o_9����#;L�4I�<����^��L�<YW�n��9�P�lY���C�!�4i"����,�n	t�����U����Yp�][��a���qG�sR����R����;=���K�������{*��KS:j�|�~��IE��3i����'1w�>'����9UGf4k�[�s
�V{����l�������]�W�sR��}����<U�����g�A�����A�b���D��Gp@���uN*��J�b�D�y'��S���js����G���z��J��JLY�o��.����~��T$u��W�&^ye���:'��Od��?�/]�^�Q�lU������oQ�f�k�I��n�9L��C^Zk���^*��{�V�!�E,�O��&�'�`_w�u:UQ�J��o�Y���y���&5�R��TQ
u��Uc���t�
R�n]�r��v�UTrr-��Ow�-UN9E��:W���b���O>�N��%�J��R�^����������"�~x�R��E�%�".�F&�������	���'�yY������	��a��=pq����x����"nI����J���B�N�Dk�*x��W�gx��0������x<E
/
=����x<^z<����x�(�x<���=�I<m�o7��Ea�@��~��\{��r�g�I'�$��U����>8#5�z�!] ������>R��L�"��z��������v��:?�,]�T���'�J��3�<S_�r��o&���������E�wkIbk�y'�x�~�,\"�����K��#�������?��2e���C��1c��4`v�������g�-�����$_*�={��h�B.��9��st����7����S����H}��p�3�%���>v,����p�:ZA�s���Qn��6}O=�T����u��x��A�eA�����yo�;1V_~�e9���t\�u���>�Ho������5M8����F���������F���?��%:u�k���\��i�F>�������1h������g{���P7��cBZ�re}���C�=�����,J���1���~����w����jJ�,��}<���D��Q���q6c}�#F�2M��]t�^<o�C��<_�s�w�)��q���w��p����w_pV���.\����>�CtF$�����e�]&�������y����Ns��?��3�����ys��
��<��-%�p�
��[o�����������O>yO�g|��w{���k���nC)\�W�^����J^Eit
&Lr������(Et�s��2d�F��(����M��+�G��
��R�O;�� ��!���O<�D��@����x��z��T�<���u�+���-�W���3g9�����m��as���'���?��R��e���2;.�B��T�\v/h���T�X1��&@=��F2p�;v�(_|���l�R��Z&t�5
!�A�i�[a"�(�q�+�����@i��������3��g��|�aHp��]%xF��}�� 'B�~�t���}���u�A��X;�3��i�}���S��6� ����������_��W����|���;�<i���>�E��\����_�a����������c��1��'�E2 4�\��O���m.W����! D�j-?��,�D#j:���>�|����a� �`��@�gCG��e����U+-n��Y�F<#�W<���7]�LPm��m����� ��[Hff�.�u��
r�����z��W����~�����X�]�v��!U�V�[n�%8J��J^5������n��i�.��r]�R���pMF�6Da,����
&������������;�M1��	TN�!F���T��?�P
+�j�����f��J�t�40��7����J[#)4l5��B^�D��	��s�
�@��cqbDM��oM��{��G7��h������(A�4���J����b��2`Z7��!z(������SD�b�0Ea���T�[���t��l5�9X��T'B����6��W�8H,,�D�Q�-;��%
M���Gk1�A��xEa��^F�Bf<��y�����D�m�OY���nj����"������u���F���=�~��!���:Q�N���]�,��q�Q�F�6���<�D@�ay��4E!e� ����#�-��7Q���i[�2�`����gB_�Ca�����EaQ�J����,��Si�0�UQ�6ZL�3����&t�X;h\<����Q6S�a�����&`!�Q~��G�=c$���g$]�i�����J�:�*4lF�Gq��-��=<��X;�B��t�k�TA�������L�t�A���D!V,X<L�(��t�����D!��Jb�������o+�(���u�Au�z��	[����Gag��7�hQ�����(�<s��6m�.���BY��$��t^����J��3�;m��={9��(D��<����E~��{�@�>�~w��J���h /Q�],v\��sd��0S��N�}2.7�?�_�������B�M>e�0}��E�Cs��JI���M50:3|S-8�)<*=�
��
��a���T���~Z9��\V��@��w0�M�B:���t�b�UaAD�`"��0���������GL�
��~�R!4��*���Bg���]�vAN����)����O�u���w���x�3��Gwn�}���|�B%�4�/�(�ead ��d��(���7(s<� So��K
�b��ANL�31�Kh��l$R�p����U �N|#��:��$XM���<o��~��k!��/^���X�l1p���x��n����P�u������;���BL`�������u�]����o�6�3V��y�PFO;�9a]B0"����i����g��z��5�o��Y���_�.~���@yU�����
8��
_x�Aa�Q�B��S)Y���� ���iOF�`���	���If�t0X�f������ �L
�����������U�����?������iA*,?�������Z�:����!8Q�����O�05����(ct�t�XWSi�d��s7�"��SO=��}��<O
�ut��FR�� �y��E�����a;�Z�\��T7���O������(�7: ������h������Ay�3��
��z�}c��$���)�|	�*������u��K."
+�����Ao=bZ���^���\����������
�&�]"+�g�=���\j�����������'������`A�RU��|w��Oev����!�Q��^�������O�u8�����h�_�)��i�<RF�X��>���@
QN�������,���c���x�!.�G<��\����`��R����&�
�������h{����u���P��3M�@?J?A{��LA,(�*yQ���9�Qt,
��TQT��g�Q������*R�=�������(.���uR�= U��7�����x<���E����x<���B����x<���B����x<������x</
=����x<
/
=����x<^z<��� ���M#�/�=B?�pd���,&��C�_���X����;P�����]#�
t�� N�7�A>� ���B�vv�Hdw���	������
��������I����:u���.��b�+�{��BQ�N.$��b��p�����C�= �@�B�:�]|��(d�+v���$��8vL`g�/�0OQ�{x��������!�`�]
!�n�;����-R���������>�3���~~+��;���B��������s���{��x�^z<��!?Q��tl��Velu��'�=\bX���m����c����oX����^�Y�f����������;y�B`_W��b+.���H�~'�t������(��	��N�:�;[�a=e���c�����{��[�-Z�H�.,�����{)�]�y�=\x�+���-�x�^�����m�����[��-�>�`:t��lE�gq�=O���B���6 
�w�6�=g��1B,�oK�G#�4���<��e/�p�dDX��M��������R�X�=�c�}X����$
o���9-Z������=[[����s���m���z-+
k-[��{Y#����=R�L������i���[ka�{������w�`�C<�7.�����W3f�����_?=E�}E�q_}����h=����vxVE����y<��m�J^z<�}��,�!X�i���S��K/�T[�G� �a>S���������������(�=N�)�G�����B "�
+
'L����}�%��������7��ob�O�>�=��d��������#
�8�m�D��,t�L�6-�gc��j�������(�:u��3����St������
�(l���1
��5���Lc��9S�����X�n1�k�BN�����
�"����o�Q.��b������O?��n�MO=C��1S�L�V�ZU[���}�����b��\<��q��z*|���Z����/��\v�er�q����U����F~�o�����A}��b��r�y���q��wk���{��{���E! ��L��3F6B��U�VM��/"�����~��a��'��S4�����x<�2��c����s���g���B��������&<|�X��Jf�����E����x<���B����x<������x<EFF��?o��v�r��IEND�B`�
#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#2)

On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached is the rebased version of the patch against the current HEAD.

Please apply this patch with the extension lock patch[1] when testing
as this patch can try to extend visibility map pages concurrently.

Because the patch leads to performance degradation in the case of
bulk-loading into a partitioned table, I think the original
proposal, which makes relation extension locks conflict under group
locking, is the more realistic approach. So I worked on this with a
simple patch instead of [1]. Attached are three patches:

* 0001 patch publishes some static functions such as
heap_parallelscan_startblock_init so that the parallel vacuum code can
use them.
* 0002 patch makes relation extension locks conflict under group locking.
* 0003 patch adds a parallel option to lazy vacuum.

Please review them.

I can see that you have put a lot of effort into this patch, and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem. I think we can park that problem for
some time (we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it. I
don't know whether that is right or not. I am not sure we can make this
ready in the PG12 timeframe, but I feel this patch deserves some
attention. I have started reading the main parallel vacuum patch, and
below are some assorted comments.

+     <para>
+      Execute <command>VACUUM</command> in parallel by <replaceable
class="parameter">N
+      </replaceable>a background workers. Collecting garbage on table
is processed
+      in block-level parallel. For tables with indexes, parallel
vacuum assigns each
+      index to each parallel vacuum worker and all garbages on a
index are processed
+      by particular parallel vacuum worker. The maximum nunber of
parallel workers
+      is <xref linkend="guc-max-parallel-workers-maintenance"/>. This
option can not
+      use with <literal>FULL</literal> option.
+     </para>

There are a couple of mistakes in above para:
(a) "..a background workers." a seems redundant.
(b) "Collecting garbage on table is processed in block-level
parallel."/"Collecting garbage on table is processed at block-level in
parallel."
(c) "For tables with indexes, parallel vacuum assigns each index to
each parallel vacuum worker and all garbages on a index are processed
by particular parallel vacuum worker."
We can rephrase it as:
"For tables with indexes, parallel vacuum assigns a worker to each
index, and all garbage on an index is processed by that particular
parallel vacuum worker."
(d) Typo: nunber/number
(e) Typo: can not/cannot
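
For illustration, with (a) to (e) applied, the whole para could read
something like this (just my wording, feel free to adjust):

      Execute VACUUM in parallel by N background workers. Collecting
      garbage on the table is processed at block-level in parallel. For
      tables with indexes, parallel vacuum assigns a worker to each index,
      and all garbage on an index is processed by that particular parallel
      vacuum worker. The maximum number of parallel workers is
      max_parallel_maintenance_workers. This option cannot be used with
      the FULL option.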

I have glanced through part of the patch, but didn't find any README or
doc containing the design of this patch. I think without having the
design in place, it is difficult to review a patch of this size and
complexity. To start with, at least explain how the work is distributed
among workers: say there are two workers which need to vacuum a table
with four indexes, how does it work? How does the leader participate and
coordinate the work? The other part you could explain is how the
state is maintained during parallel vacuum, something like you are
trying to do in the below function:

+ * lazy_prepare_next_state
+ *
+ * Before entering the next state, prepare for it. In parallel lazy vacuum,
+ * we must wait for all vacuum workers to finish the previous state before
+ * the preparation. Also, after preparing, we change the state of all vacuum
+ * workers and wake them up.
+ */
+static void
+lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader, int next_state)

Another thing to explain is how the stats are shared between the leader
and the workers. I can understand a few things in bits and pieces while
glancing through the patch, but it would be easier to understand if you
documented it all in one place. That would help reviewers understand it.

Can you consider splitting the patch so that the refactoring you have
done in the current code to make it usable by parallel vacuum is a
separate patch?

+/*
+ * Vacuum all indexes. In parallel vacuum, each workers take indexes
+ * one by one. Also after vacuumed index they mark it as done. This marking
+ * is necessary to guarantee that all indexes are vacuumed based on
+ * the current collected dead tuples. The leader process continues to
+ * vacuum even if any indexes is not vacuumed completely due to failure of
+ * parallel worker for whatever reason. The mark will be checked
before entering
+ * the next state.
+ */
+void
+lazy_vacuum_all_indexes(LVState *lvstate)

I didn't understand what you want to say here. Do you mean that the
leader can continue collecting more dead tuple TIDs while workers are
vacuuming the indexes? How does it deal with errors, if any, during
index vacuum?

+ * plan_lazy_vacuum_workers_index_workers
+ * Use the planner to decide how many parallel worker processes
+ * VACUUM and autovacuum should request for use
+ *
+ * tableOid is the table begin vacuumed which must not be non-tables or
+ * special system tables.
..
+ plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested)

The comment starting from tableOid is not clear. The actual function
name (plan_lazy_vacuum_workers) and the name in the comment
(plan_lazy_vacuum_workers_index_workers) don't match. Can you take a
Relation as the input parameter instead of taking tableOid, as that can
save a lot of code in this function? See the sketch below.
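
A hypothetical sketch of what I mean, i.e. reuse the relation the caller
has already opened and locked instead of re-opening it by OID (the
worker-count logic here is just a placeholder, not a proposal):

    static int
    plan_lazy_vacuum_workers(Relation onerel, int nworkers_requested)
    {
        /*
         * The caller has already opened and locked the relation, so we
         * can inspect its indexes directly instead of re-opening by OID.
         */
        int     nindexes = list_length(RelationGetIndexList(onerel));
        int     parallel_workers = 0;

        if (nindexes > 1)
            parallel_workers = (nworkers_requested > 0) ?
                nworkers_requested : nindexes - 1;

        return Min(parallel_workers, max_parallel_maintenance_workers);
    }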

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#6)

On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I can see that you have put a lot of effort into this patch, and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem. I think we can park that problem for
some time (we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it.

Today, I was reading this and the previous related thread [1], and it seems
to me multiple people, Andres [2] and Simon [3], have pointed out that
parallelization of the index portion is more valuable. Also, some of the
results [4] indicate the same. Now, when there are no indexes,
parallelizing heap scans also has a benefit, but I think in practice we
will see more cases where the user wants to vacuum tables with
indexes. So how about if we break this problem up in the following way,
where each piece gives a benefit of its own:
(a) Parallelize index scans, wherein the workers will be launched only
to vacuum indexes. Only one worker per index will be spawned.
(b) Parallelize per-index vacuum. Each index can be vacuumed by
multiple workers.
(c) Parallelize heap scans, where multiple workers will scan the heap,
collect dead TIDs and then launch multiple workers for indexes.

I think if we break this problem into multiple patches, it will reduce
the scope of each patch and help us in making progress. Now, it's been
more than 2 years that we have been trying to solve this problem, but
we still haven't made much progress. I understand there are various
genuine reasons, and all of that work will help us in solving all the
problems in this area. How about if we first target problem (a), and
once we are done with that we can see which of (b) or (c) we want to
do first?

[1]: /messages/by-id/CAD21AoD1xAqp4zK-Vi1cuY3feq2oO8HcpJiz32UDUfe0BE31Xw@mail.gmail.com
[2]: /messages/by-id/20160823164836.naody2ht6cutioiz@alap3.anarazel.de
[3]: /messages/by-id/CANP8+jKWOw6AAorFOjdynxUKqs6XRReOcNy-VXRFFU_4bBT8ww@mail.gmail.com
[4]: /messages/by-id/CAGTBQpbU3R_VgyWk6jaD=6v-Wwrm8+6CbrzQxQocH0fmedWRkw@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#8Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#7)

On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for the comment.

I can see that you have put a lot of effort into this patch, and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem. I think we can park that problem for
some time (we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it.

Today, I was reading this and the previous related thread [1], and it seems
to me multiple people, Andres [2] and Simon [3], have pointed out that
parallelization of the index portion is more valuable. Also, some of the
results [4] indicate the same. Now, when there are no indexes,
parallelizing heap scans also has a benefit, but I think in practice we
will see more cases where the user wants to vacuum tables with
indexes. So how about if we break this problem up in the following way,
where each piece gives a benefit of its own:
(a) Parallelize index scans, wherein the workers will be launched only
to vacuum indexes. Only one worker per index will be spawned.
(b) Parallelize per-index vacuum. Each index can be vacuumed by
multiple workers.
(c) Parallelize heap scans, where multiple workers will scan the heap,
collect dead TIDs and then launch multiple workers for indexes.

I think if we break this problem into multiple patches, it will reduce
the scope of each patch and help us in making progress. Now, it's been
more than 2 years that we have been trying to solve this problem, but
we still haven't made much progress. I understand there are various
genuine reasons, and all of that work will help us in solving all the
problems in this area. How about if we first target problem (a), and
once we are done with that we can see which of (b) or (c) we want to
do first?

Thank you for the suggestion. It seems good to me. We would get nice
performance scalability even with only (a), and vacuum will get more
powerful with (b) or (c). Also, (a) would not require resolving the
relation extension lock issue, IIUC. I'll change the patch and submit it
to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#8)

On Mon, Nov 26, 2018 at 2:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for the comment.

I can see that you have put a lot of effort into this patch, and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem. I think we can park that problem for
some time (we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it.

Today, I was reading this and the previous related thread [1], and it seems
to me multiple people, Andres [2] and Simon [3], have pointed out that
parallelization of the index portion is more valuable. Also, some of the
results [4] indicate the same. Now, when there are no indexes,
parallelizing heap scans also has a benefit, but I think in practice we
will see more cases where the user wants to vacuum tables with
indexes. So how about if we break this problem up in the following way,
where each piece gives a benefit of its own:
(a) Parallelize index scans, wherein the workers will be launched only
to vacuum indexes. Only one worker per index will be spawned.
(b) Parallelize per-index vacuum. Each index can be vacuumed by
multiple workers.
(c) Parallelize heap scans, where multiple workers will scan the heap,
collect dead TIDs and then launch multiple workers for indexes.

I think if we break this problem into multiple patches, it will reduce
the scope of each patch and help us in making progress. Now, it's been
more than 2 years that we have been trying to solve this problem, but
we still haven't made much progress. I understand there are various
genuine reasons, and all of that work will help us in solving all the
problems in this area. How about if we first target problem (a), and
once we are done with that we can see which of (b) or (c) we want to
do first?

Thank you for the suggestion. It seems good to me. We would get nice
performance scalability even with only (a), and vacuum will get more
powerful with (b) or (c). Also, (a) would not require resolving the
relation extension lock issue, IIUC.

Yes, I also think so. We do acquire the 'relation extension lock' during
index vacuum, but as part of (a), we are talking about one worker per
index, so there shouldn't be a problem with respect to deadlocks.

I'll change the patch and submit it
to the next CF.

Okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#10Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#9)
2 attachment(s)

On Tue, Nov 27, 2018 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 26, 2018 at 2:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for the comment.

I can see that you have put a lot of effort into this patch, and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem. I think we can park that problem for
some time (we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it.

Today, I was reading this and the previous related thread [1], and it seems
to me multiple people, Andres [2] and Simon [3], have pointed out that
parallelization of the index portion is more valuable. Also, some of the
results [4] indicate the same. Now, when there are no indexes,
parallelizing heap scans also has a benefit, but I think in practice we
will see more cases where the user wants to vacuum tables with
indexes. So how about if we break this problem up in the following way,
where each piece gives a benefit of its own:
(a) Parallelize index scans, wherein the workers will be launched only
to vacuum indexes. Only one worker per index will be spawned.
(b) Parallelize per-index vacuum. Each index can be vacuumed by
multiple workers.
(c) Parallelize heap scans, where multiple workers will scan the heap,
collect dead TIDs and then launch multiple workers for indexes.

I think if we break this problem into multiple patches, it will reduce
the scope of each patch and help us in making progress. Now, it's been
more than 2 years that we have been trying to solve this problem, but
we still haven't made much progress. I understand there are various
genuine reasons, and all of that work will help us in solving all the
problems in this area. How about if we first target problem (a), and
once we are done with that we can see which of (b) or (c) we want to
do first?

Thank you for the suggestion. It seems good to me. We would get nice
performance scalability even with only (a), and vacuum will get more
powerful with (b) or (c). Also, (a) would not require resolving the
relation extension lock issue, IIUC.

Yes, I also think so. We do acquire the 'relation extension lock' during
index vacuum, but as part of (a), we are talking about one worker per
index, so there shouldn't be a problem with respect to deadlocks.

I'll change the patch and submit it
to the next CF.

Okay.

Attached are the updated patches. I scaled back the scope of this patch.
The patch now includes only feature (a); that is, it executes both index
vacuum and index cleanup in parallel. It also doesn't include
autovacuum support for now.

The PARALLEL option works almost the same as in the previous patch. In the
VACUUM command, we can specify the 'PARALLEL n' option, where n is the
number of parallel workers to request. If n is omitted, the number of
parallel workers requested is the number of indexes - 1. We can also
specify the parallel degree with the parallel_workers reloption. The
number of parallel workers is capped by Min(# of indexes - 1,
max_parallel_maintenance_workers). That is, parallel vacuum can be
executed for a table only if it has more than one index.
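For example, on a table with four indexes and
max_parallel_maintenance_workers = 2, VACUUM (PARALLEL) requests
# of indexes - 1 = 3 workers but is capped at 2, while
VACUUM (PARALLEL 2) requests exactly 2 workers.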

The details of the internal design are written in the comment at the top
of the vacuumlazy.c file. In parallel vacuum mode, we allocate a DSM
segment at the beginning of lazy vacuum, which stores the shared
information as well as the dead tuples. When starting either index vacuum
or index cleanup, we launch parallel workers. The parallel workers perform
index vacuum or index cleanup on individual indexes, and exit after all
indexes are done. The leader process then re-initializes the DSM and
re-launches workers the next time; it does not destroy the parallel
context here. After the lazy vacuum is done, the leader process exits
parallel mode and updates the index statistics, since no writes are
allowed during parallel mode. A simplified sketch of the leader side of
one such pass is below.
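
To illustrate (a rough sketch only, reusing the patch's LVState/LVShared
and the existing parallel infrastructure; this is not the exact code from
the patch):

    static void
    lazy_parallel_index_pass(LVState *lvstate, bool for_cleanup)
    {
        /* Tell the workers whether this pass is index vacuum or cleanup. */
        lvstate->lvshared->for_cleanup = for_cleanup;

        /* Launch the workers; each one takes indexes one by one. */
        LaunchParallelWorkers(lvstate->pcxt);

        /* The leader also processes indexes, then waits for the workers. */
        lazy_vacuum_all_indexes(lvstate);
        WaitForParallelWorkersToFinish(lvstate->pcxt);

        /*
         * All workers have exited.  Reset the DSM for the next pass but
         * keep the parallel context; we neither exit parallel mode nor
         * destroy the context until the whole lazy vacuum is done.
         */
        ReinitializeParallelDSM(lvstate->pcxt);
    }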

Also, I've attached a 0002 patch that adds parallel lazy vacuum support
to the vacuumdb command.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v9-0002-Add-P-option-to-vacuumdb-command.patch (application/x-patch)
From 33a9a44fa4f090d7dd6dd319edcb1cb754064de8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 16:08:24 +0900
Subject: [PATCH v9 2/2] Add -P option to vacuumdb command.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 50 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 955a17a..0d085a6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -158,6 +158,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
+        setting is at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 4c477a2..4d513a1 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 23;
+use Test::More tests => 27;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -33,6 +33,14 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-Z', 'postgres' ],
 	qr/statement: ANALYZE;/,
 	'vacuumdb -Z');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index bcea9e5..ee7bd7e 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -40,6 +40,9 @@ typedef struct vacuumingOptions
 	bool		and_analyze;
 	bool		full;
 	bool		freeze;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers.
+									 */
 } vacuumingOptions;
 
 
@@ -108,6 +111,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{NULL, 0, NULL, 0}
@@ -133,6 +137,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -140,7 +145,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -207,6 +212,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -251,9 +275,22 @@ main(int argc, char *argv[])
 					progname, "freeze");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -667,6 +704,16 @@ prepare_vacuum_command(PQExpBuffer sql, PGconn *conn,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1004,6 +1051,7 @@ help(const char *progname)
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
 	printf(_("  -v, --verbose                   write a lot of output\n"));
-- 
2.10.5

v9-0001-Add-parallel-option-to-VACUUM-command.patch (application/x-patch)
From 291fb83c321c720b45b3eda227af28dd8350d2ed Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v9 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum can be performed by specifying something like
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with 2
parallel worker processes. Alternatively, setting the parallel_workers
reloption to more than 0 invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  11 +-
 doc/src/sgml/ref/vacuum.sgml          |  17 +
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  76 ++--
 src/backend/commands/vacuumlazy.c     | 815 +++++++++++++++++++++++++++++-----
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |   8 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/commands/vacuum.h         |   7 +-
 src/include/nodes/parsenodes.h        |  17 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 13 files changed, 851 insertions(+), 192 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 4a7121a..5c3cd09 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,11 +2185,12 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when
+         building a B-tree index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from
+         the pool of processes established by <xref
          linkend="guc-max-worker-processes"/>, limited by <xref
          linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..453890d 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,22 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and index cleanup in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1 processes, which is the
+      maximum number of parallel vacuum workers, since each individual index is processed by
+      one process. The actual number of parallel vacuum workers may be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index b9a9ae5..33c46e0 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -23,6 +23,7 @@
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
+#include "commands/vacuum.h"
 #include "executor/execParallel.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"lazy_parallel_vacuum_main", lazy_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 25b3b03..401262e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -68,13 +68,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options);
+static List *get_all_vacuum_rels(VacuumOption options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +89,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -112,11 +112,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -163,7 +169,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +180,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +190,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +212,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +222,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +287,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +341,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +360,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +396,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -603,7 +609,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +641,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +653,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +679,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +748,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +766,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1521,7 +1527,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1548,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1588,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1611,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1683,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1702,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1710,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8134c52..d4acb47 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each individual index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (at lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains the shared information
+ * as well as the memory space for dead tuples. When starting either index
+ * vacuum or index cleanup, we launch parallel worker processes. Once all
+ * indexes are processed the parallel worker processes exit and the leader
+ * process re-initializes the shared memory segment. Note that the parallel
+ * workers live only during one index vacuum or index cleanup pass, but the
+ * leader process neither exits parallel mode nor destroys the parallel
+ * context. Since no writes are allowed during parallel mode, we update the
+ * index statistics after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,8 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,10 +128,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM key for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* have the stats been updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVTidMap controls the dead tuple TIDs collected during heap scan. This is
+ * allocated in a dynamic shared memory segment in parallel lazy vacuum mode,
+ * or in local memory otherwise.
+ */
+typedef struct LVTidMap
+{
+	int			max_dead_tuples;	/* # slots allocated in array */
+	int			num_dead_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVTidMap;
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel index vacuum and index cleanup. This is allocated in a
+ * dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to do index vacuum or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields needed for IndexVacuumInfo in index vacuum, index cleanup,
+	 * or both.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuum or the new live tuples for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. An variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -129,16 +215,34 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy vacuum execution. This is present only in the
+ * leader process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are not
+ * NULL, and 'lvshared' points into a dynamic shared memory segment.
+ */
+typedef struct LVState
+{
+	Oid			relid;
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; otherwise one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOption	options;
+	bool			is_wraparound;
+	bool			aggressive;
+
+	/* Variables for parallel lazy vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -151,31 +255,44 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVTidMap	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVTidMap *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVTidMap *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static int lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes);
+static LVTidMap *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static void lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
+											   IndexBulkDeleteResult **stats,
+											   LVTidMap *dead_tuples,
+											   bool do_parallel,
+											   bool for_cleanup);
+static void lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+												LVShared *lvshared, LVTidMap *dead_tuples,
+												bool for_cleanup);
 
 /*
  *	lazy_vacuum_rel() -- perform LAZY VACUUM for one heap relation
@@ -187,9 +304,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+lazy_vacuum_rel(Relation onerel, VacuumOption options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -201,6 +319,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -218,7 +337,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,7 +365,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -259,10 +378,20 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->relation = onerel;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->aggressive = aggressive;
+	lvstate->is_wraparound = params->is_wraparound;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -333,7 +462,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -465,14 +594,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we run both index vacuum and index cleanup in parallel. When allocating the
+ *		space for the lazy heap scan, we enter parallel mode, create the parallel
+ *		context and initialize a dynamic shared memory segment for the dead tuples.
+ *		dead_tuples points either to a dynamic shared memory segment in parallel
+ *		vacuum or to local memory in single-process vacuum. Before starting parallel
+ *		index vacuum and parallel index cleanup we launch parallel workers. All
+ *		parallel workers exit after all indexes are processed, and the leader
+ *		process re-initializes the parallel context and re-launches them the next
+ *		time. The index statistics are updated by the leader after exiting
+ *		parallel mode, since currently no writes are allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVTidMap	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -495,6 +639,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	bool		do_parallel = false;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -505,7 +651,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -521,7 +667,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
 	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+		palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
 
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
@@ -530,13 +676,26 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+	{
+		parallel_workers = lazy_compute_parallel_workers(lvstate->relation,
+														 lvstate->options.nworkers,
+														 lvstate->nindexes);
+		if (parallel_workers > 0)
+			do_parallel = true;
+	}
+
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_dead_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -584,7 +743,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -592,7 +751,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -639,7 +798,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -648,7 +807,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -677,7 +836,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -701,7 +860,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -714,8 +873,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_dead_tuples - dead_tuples->num_dead_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_dead_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -743,10 +902,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples,
+											   do_parallel, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -759,14 +916,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -804,7 +961,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -837,7 +994,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -956,7 +1113,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_dead_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -995,7 +1152,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1135,7 +1292,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1204,11 +1361,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_dead_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1216,7 +1374,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1332,7 +1490,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_dead_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1366,7 +1524,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_dead_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1382,10 +1540,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples,
+										   do_parallel, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1395,7 +1551,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1412,8 +1568,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples,
+									   do_parallel, true);
+
+	if (do_parallel)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1468,8 +1627,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1479,7 +1639,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_dead_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1488,7 +1648,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1497,8 +1657,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1533,8 +1694,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVTidMap *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1546,16 +1708,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_dead_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1576,7 +1738,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1675,6 +1837,86 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes, with parallel workers if requested. This
+ * function is used by the parallel vacuum leader process. In parallel lazy
+ * vacuum, we save the index bulk-deletion results to the shared memory space.
+ * Since all vacuum workers process different indexes, we can write them
+ * without locking.
+ */
+static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate, IndexBulkDeleteResult **stats,
+								   LVTidMap *dead_tuples, bool do_parallel,
+								   bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+
+	/* nothing to do if the table has no indexes */
+	if (lvstate->nindexes < 1)
+		return;
+
+	if (do_parallel)
+		lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *r = NULL;
+
+		/*
+		 * Get the next index number to vacuum and set index statistics. In
+		 * parallel lazy vacuum, index bulk-deletion results are stored in the
+		 * shared memory segment. If it has already been updated we use it
+		 * rather than setting it to NULL. In single-process vacuum, we can
+		 * always use an element of 'stats'.
+		 */
+		if (do_parallel)
+		{
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+			if (lvshared->indstats[idx].updated)
+				r = &(lvshared->indstats[idx].stats);
+		}
+		else
+		{
+			idx = nprocessed++;
+			r = stats[idx];
+		}
+
+		/* all indexes processed? */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (!for_cleanup)
+			r = lazy_vacuum_index(lvstate->indRels[idx], r,
+								  vacrelstats->old_rel_pages,
+								  dead_tuples);
+		else
+			r = lazy_cleanup_index(lvstate->indRels[idx], r,
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+								   !do_parallel);
+
+		if (do_parallel && r)
+		{
+			/* save index bulk-deletion result to the shared memory space */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), r, sizeof(IndexBulkDeleteResult));
+
+			/* save pointer to the shared memory segment */
+			r = &(lvshared->indstats[idx].stats);
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1682,11 +1924,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVTidMap *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1696,28 +1938,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions %s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_dead_tuples,
+					IsParallelWorker() ? "by vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1726,27 +1969,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1767,8 +2004,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2078,15 +2314,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVTidMap *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVTidMap	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2100,34 +2337,44 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * Allocate memory for dead tuples. In parallel lazy vacuum, we enter
+	 * parallel mode and prepare all the memory necessary for executing
+	 * parallel lazy vacuum, including the space to store dead tuples. In
+	 * single-process vacuum, we allocate it in local memory.
+	 */
+	if (parallel_workers > 0)
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+	else
+	{
+		dead_tuples = (LVTidMap *)
+			palloc(SizeOfLVTidMap + maxtuples * sizeof(ItemPointerData));
+		dead_tuples->num_dead_tuples = 0;
+		dead_tuples->max_dead_tuples = (int) maxtuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_dead_tuples < dead_tuples->max_dead_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_dead_tuples] = *itemptr;
+		dead_tuples->num_dead_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_dead_tuples);
 	}
 }
 
@@ -2141,12 +2388,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_dead_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2294,3 +2541,329 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. A vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum processes each index with one vacuum process. The sizes of the
+ * table and indexes don't affect the parallel degree.
+ */
+static int
+lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers = nindexes - 1;
+
+	if (nindexes < 2)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+		parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, then allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples.
+ */
+static LVTidMap *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVTidMap	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVTidMap),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate VACUUM_KEY_QUERY_TEXT space. Auto vacuums don't have
+	 * debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_dead_tuples = maxtuples;
+	tidmap->num_dead_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store query string for workers (autovacuum has no query string) */
+	if (debug_query_string)
+	{
+		sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+		memcpy(sharedquery, debug_query_string, querylen + 1);
+		sharedquery[querylen] = '\0';
+		shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+	}
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+}
+
+/*
+ * Begin parallel index vacuum or index cleanup. Set up shared information
+ * and launch parallel worker processes.
+ */
+static void
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either vacuuming indexes or cleaning indexes.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may succeed in launching workers at the next
+	 * execution, we don't want to end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+		return;
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report vacuum progress
+ * information.
+ */
+void
+lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVTidMap	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVTidMap *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_all_indexes_for_worker(indrels, nindexes, lvshared,
+									   dead_tuples,
+									   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function is used by the parallel vacuum
+ * worker processes. As in the leader process, we save the bulk-deletion
+ * results to the shared memory space.
+ */
+static void
+lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+								   LVShared *lvshared, LVTidMap *dead_tuples,
+								   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* all indexes processed? */
+		if (idx >= nindexes)
+			break;
+
+		/* If this index has already been processed, get the pointer */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		if (!for_cleanup)
+			stats = lazy_vacuum_index(indrels[idx], stats, lvshared->reltuples,
+									  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], stats, lvshared->reltuples,
+							   lvshared->estimated_count, false);
+
+		if (stats)
+		{
+			/* Update the shared index statistics */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), stats, sizeof(IndexBulkDeleteResult));
+		}
+	}
+}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 273e275..2d27af5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1668,8 +1668,10 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 2c2208f..1707959 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOption *makeVacOpt(VacuumOptionFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOption		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10478,22 +10480,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10501,20 +10505,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOption *vacopt1 = $1;
+					VacuumOption *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10526,16 +10550,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16033,6 +16057,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOption with the given options.
+ */
+static VacuumOption *
+makeVacOpt(VacuumOptionFlag flag, int nworkers)
+{
+	VacuumOption *vacopt = palloc(sizeof(VacuumOption));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2d5086d..a84878a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -184,11 +184,15 @@ typedef struct av_relation
 								 * reloptions, or NULL if none */
 } av_relation;
 
-/* struct to keep track of tables to vacuum and/or analyze, after rechecking */
+/*
+ * struct to keep track of tables to vacuum and/or analyze, after rechecking.
+ * Since autovacuum doesn't support parallel lazy vacuum, at_vacoptions
+ * is just a bitmask of VacuumOptionFlag.
+ */
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
+	int			at_vacoptions;	/* bitmask of VacuumOptionFlag */
 	VacuumParams at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 970c94e..23dc6d3 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index dfff23a..5b87241 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -15,6 +15,7 @@
 #define VACUUM_H
 
 #include "access/htup.h"
+#include "access/parallel.h"
 #include "catalog/pg_class.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
@@ -163,7 +164,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -197,8 +198,10 @@ extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
 					 VacuumParams *params, int options, LOCKMODE lmode);
 
 /* in commands/vacuumlazy.c */
-extern void lazy_vacuum_rel(Relation onerel, int options,
+extern void lazy_vacuum_rel(Relation onerel, VacuumOption options,
 				VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e5bdc1c..a2b4662 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3144,7 +3144,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3153,7 +3153,14 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 7,	/* do lazy VACUUM in parallel */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8	/* don't skip any pages */
+} VacuumOptionFlag;
+
+typedef struct VacuumOption
+{
+	VacuumOptionFlag	flags;	/* OR of VacuumOptionFlag */
+	int					nworkers;	/* # of parallel vacuum workers */
 } VacuumOption;
 
 /*
@@ -3173,9 +3180,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOption	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#10)

On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patches. I scaled back the scope of this patch.
The patch now includes only feature (a), that is, it executes both
index vacuum and index cleanup in parallel. It also doesn't include
autovacuum support for now.

The PARALLEL option works almost the same as in the previous patch. In
the VACUUM command, we can specify the 'PARALLEL n' option where n is
the number of parallel workers to request. If n is omitted, the number
of parallel workers would be # of indexes - 1.

I think for now this is okay, but I guess in future, when we make
heap scans parallel as well or allow more than one worker per index
vacuum, this won't hold good. So, I am not sure the below text in
the docs is most appropriate.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1 processes, which is the
+      maximum number of parallel vacuum workers since individual indexes is processed by
+      one process. The actual number of parallel vacuum workers may be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option can not use with  <literal>FULL</literal> option.

It might be better to use some generic language in the docs, something
like "If the parallel degree N is omitted, then vacuum decides the
number of workers based on the number of indexes on the relation, which
is further limited by max-parallel-workers-maintenance". I think you
also need to mention in some way that you consider the storage option
parallel_workers.
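
To make the behavior concrete, this is roughly what I understand the
patch to do for a made-up table t with four indexes, no
parallel_workers reloption set, and max_parallel_maintenance_workers
high enough:

VACUUM (PARALLEL) t;          -- requests 3 workers (# of indexes - 1)
VACUUM (PARALLEL 2) t;        -- requests exactly 2 workers
VACUUM (FULL, PARALLEL 2) t;  -- error: PARALLEL cannot be used with FULL

If we go with the generic wording, a couple of examples along these
lines might be worth adding to the docs as well.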

Few assorted comments:
1.
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
{
..
+
+ LaunchParallelWorkers(lvstate->pcxt);
+
+ /*
+ * if no workers launched, we vacuum all indexes by the leader process
+ * alone. Since there is hope that we can launch workers in the next
+ * execution time we don't want to end the parallel mode yet.
+ */
+ if (lvstate->pcxt->nworkers_launched == 0)
+ return;

It is quite possible that the workers are not launched because we fail
to allocate memory, basically when pcxt->nworkers is zero. I think in
such cases there is no use for being in parallel mode. You can even
detect that before calling lazy_begin_parallel_vacuum_index.
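
To be concrete, something like the following (an untested sketch, using
the names from your patch) at the end of lazy_prepare_parallel would
avoid that:

	/* after InitializeParallelDSM(pcxt) */
	if (pcxt->nworkers == 0)
	{
		/* could not prepare any workers; fall back to serial vacuum */
		DestroyParallelContext(pcxt);
		ExitParallelMode();
		return NULL;	/* caller allocates the dead tuple space locally */
	}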

2.
static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
IndexBulkDeleteResult **stats,
+    LVTidMap *dead_tuples, bool do_parallel,
+    bool for_cleanup)
{
..
+ if (do_parallel)
+ lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+ for (;;)
+ {
+ IndexBulkDeleteResult *r = NULL;
+
+ /*
+ * Get the next index number to vacuum and set index statistics. In parallel
+ * lazy vacuum, index bulk-deletion results are stored in the shared memory
+ * segment. If it's already updated we use it rather than setting it to NULL.
+ * In single vacuum, we can always use an element of the 'stats'.
+ */
+ if (do_parallel)
+ {
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ if (lvshared->indstats[idx].updated)
+ r = &(lvshared->indstats[idx].stats);
+ }

It is quite possible that we are not able to launch any workers in
lazy_begin_parallel_vacuum_index; in such cases, we should not use
the parallel-mode path. Basically, as written, we can't rely on the
'do_parallel' flag.
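
In other words, the decision should be re-derived from the state of
the parallel context after the setup, instead of trusting a flag
computed up front. Roughly (untested sketch, recomputing 'do_parallel'
before the loop):

	/* re-derive the decision instead of relying on the initial flag */
	do_parallel = (lvstate->pcxt != NULL &&
				   lvstate->pcxt->nworkers > 0);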

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#12Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#11)
2 attachment(s)

On Thu, Dec 20, 2018 at 3:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patches. I scaled back the scope of this patch.
The patch now includes only feature (a), that is, it executes both
index vacuum and index cleanup in parallel. It also doesn't include
autovacuum support for now.

The PARALLEL option works almost the same as in the previous patch. In
the VACUUM command, we can specify the 'PARALLEL n' option where n is
the number of parallel workers to request. If n is omitted, the number
of parallel workers would be # of indexes - 1.

I think for now this is okay, but I guess in future, when we make
heap scans parallel as well or allow more than one worker per index
vacuum, this won't hold good. So, I am not sure the below text in
the docs is most appropriate.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1 processes, which is the
+      maximum number of parallel vacuum workers since individual indexes is processed by
+      one process. The actual number of parallel vacuum workers may be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option can not use with  <literal>FULL</literal> option.

It might be better to use some generic language in the docs, something
like "If the parallel degree N is omitted, then vacuum decides the
number of workers based on the number of indexes on the relation, which
is further limited by max-parallel-workers-maintenance".

Thank you for the review.

I agree with your concern and have used the text you suggested.

I think you
also need to mention in some way that you consider the storage option
parallel_workers.

Added.

Few assorted comments:
1.
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
{
..
+
+ LaunchParallelWorkers(lvstate->pcxt);
+
+ /*
+ * if no workers launched, we vacuum all indexes by the leader process
+ * alone. Since there is hope that we can launch workers in the next
+ * execution time we don't want to end the parallel mode yet.
+ */
+ if (lvstate->pcxt->nworkers_launched == 0)
+ return;

It is quite possible that the workers are not launched because we fail
to allocate memory, basically when pcxt->nworkers is zero. I think in
such cases there is no use for being in parallel mode. You can even
detect that before calling lazy_begin_parallel_vacuum_index.

Agreed. We can stop the preparation and exit parallel mode if
pcxt->nworkers is 0 after InitializeParallelDSM().

2.
static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
IndexBulkDeleteResult **stats,
+    LVTidMap *dead_tuples, bool do_parallel,
+    bool for_cleanup)
{
..
+ if (do_parallel)
+ lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+ for (;;)
+ {
+ IndexBulkDeleteResult *r = NULL;
+
+ /*
+ * Get the next index number to vacuum and set index statistics. In parallel
+ * lazy vacuum, index bulk-deletion results are stored in the shared memory
+ * segment. If it's already updated we use it rather than setting it to NULL.
+ * In single vacuum, we can always use an element of the 'stats'.
+ */
+ if (do_parallel)
+ {
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ if (lvshared->indstats[idx].updated)
+ r = &(lvshared->indstats[idx].stats);
+ }

It is quite possible that we are not able to launch any workers in
lazy_begin_parallel_vacuum_index; in such cases, we should not use
the parallel-mode path. Basically, as written, we can't rely on the
'do_parallel' flag.

Fixed.

Attached new version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v10-0002-Add-P-option-to-vacuumdb-command.patchapplication/octet-stream; name=v10-0002-Add-P-option-to-vacuumdb-command.patchDownload
From 3f77c12d27a31fb15dfa11cbdda7ac8b321faaa9 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 16:08:24 +0900
Subject: [PATCH v10 2/2] Add -P option to vacuumdb command.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 50 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 955a17a..0d085a6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -158,6 +158,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
+        setting is more than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 4c477a2..4d513a1 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 23;
+use Test::More tests => 27;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -33,6 +33,14 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-Z', 'postgres' ],
 	qr/statement: ANALYZE;/,
 	'vacuumdb -Z');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index bcea9e5..ee7bd7e 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -40,6 +40,9 @@ typedef struct vacuumingOptions
 	bool		and_analyze;
 	bool		full;
 	bool		freeze;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers.
+									 */
 } vacuumingOptions;
 
 
@@ -108,6 +111,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{NULL, 0, NULL, 0}
@@ -133,6 +137,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -140,7 +145,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -207,6 +212,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -251,9 +275,22 @@ main(int argc, char *argv[])
 					progname, "freeze");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -667,6 +704,16 @@ prepare_vacuum_command(PQExpBuffer sql, PGconn *conn,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1004,6 +1051,7 @@ help(const char *progname)
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
 	printf(_("  -v, --verbose                   write a lot of output\n"));
-- 
2.10.5

v10-0001-Add-parallel-option-to-VACUUM-command.patchapplication/octet-stream; name=v10-0001-Add-parallel-option-to-VACUUM-command.patchDownload
From 0f04345a5bd2ea1c1e7b217f93c2f2b4d1ef395a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v10 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. In parallel index vacuum and parallel index
cleanup, each index is processed by one of the vacuum processes,
including the leader process.

Parallel vacuum can be performed by specifying, for example,
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with 2
parallel worker processes. We can also control the parallel
degree by setting the parallel_workers reloption to more than 0.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  11 +-
 doc/src/sgml/ref/vacuum.sgml          |  25 +
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  76 ++--
 src/backend/commands/vacuumlazy.c     | 834 +++++++++++++++++++++++++++++-----
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |  11 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/commands/vacuum.h         |   7 +-
 src/include/nodes/parsenodes.h        |  17 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 13 files changed, 878 insertions(+), 195 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e94b305..b018b7b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,11 +2185,12 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from
+         the pool of processes established by <xref
          linkend="guc-max-worker-processes"/>, limited by <xref
          linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..95d6ff20 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,21 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and index cleanup in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the
+      parallel degree <replaceable class="parameter">N</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +277,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overridden by specifying
+    <replaceable class="parameter">N</replaceable> in the <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index b9a9ae5..33c46e0 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -23,6 +23,7 @@
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
+#include "commands/vacuum.h"
 #include "executor/execParallel.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"lazy_parallel_vacuum_main", lazy_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 25b3b03..401262e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -68,13 +68,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options);
+static List *get_all_vacuum_rels(VacuumOption options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +89,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -112,11 +112,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -163,7 +169,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +180,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +190,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +212,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +222,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +287,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +341,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +360,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +396,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -603,7 +609,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +641,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +653,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +679,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +748,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +766,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1521,7 +1527,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1548,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1588,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1611,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1683,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1702,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1710,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8134c52..85b5de8 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each individual index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (at lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains shared information as
+ * well as the memory space for dead tuples. When starting either index vacuum
+ * or index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed, the parallel worker processes exit and the leader process
+ * re-initializes the shared memory segment. Note that the parallel workers
+ * live only for the duration of a single index vacuum or index cleanup pass;
+ * the leader process neither exits parallel mode nor destroys the parallel
+ * context in between. Since no updates are allowed while in parallel mode,
+ * the leader updates the index statistics only after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,8 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,10 +128,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM key for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for the index bulk-deletion statistics used by parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* have the stats been updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVTidMap holds the dead tuple TIDs collected during the heap scan. This is
+ * allocated in a dynamic shared memory segment in parallel lazy vacuum mode, or
+ * in local memory otherwise.
+ */
+typedef struct LVTidMap
+{
+	int			max_dead_tuples;	/* # slots allocated in array */
+	int			num_dead_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVTidMap;
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared state for parallel index vacuum and index cleanup. This is allocated
+ * in a dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to perform index vacuum or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields needed to construct an IndexVacuumInfo for index vacuum, index
+	 * cleanup, or both.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuum, or the new live tuples for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. An variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -129,16 +215,34 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy vacuum execution. This is present only in the leader
+ * process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are non-NULL and
+ * point into a dynamic shared memory segment.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; nindexes == 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOption	options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;
+
+	/* Variables for parallel lazy vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -151,31 +255,43 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVTidMap	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVTidMap *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVTidMap *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static int lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes);
+static LVTidMap *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
+											   IndexBulkDeleteResult **stats,
+											   LVTidMap *dead_tuples,
+											   bool for_cleanup);
+static void lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+												LVShared *lvshared, LVTidMap *dead_tuples,
+												bool for_cleanup);
 
 /*
  *	lazy_vacuum_rel() -- perform LAZY VACUUM for one heap relation
@@ -187,9 +303,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+lazy_vacuum_rel(Relation onerel, VacuumOption options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -201,6 +318,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -218,7 +336,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,7 +364,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -259,10 +377,23 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -333,7 +464,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -465,14 +596,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we run both index vacuum and index cleanup in parallel. When allocating the
+ *		space for the lazy heap scan, we enter parallel mode, create the parallel
+ *		context and initialize a dynamic shared memory segment for dead tuples.
+ *		dead_tuples points either to a dynamic shared memory segment in the parallel
+ *		vacuum case or to local memory in the single vacuum case. Before starting
+ *		parallel index vacuum and parallel index cleanup we launch parallel workers.
+ *		All parallel workers exit after all indexes have been processed; the leader
+ *		process re-initializes the parallel context and re-launches them at the next
+ *		execution. The index statistics are updated by the leader after exiting
+ *		parallel mode, since currently no writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVTidMap	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -495,6 +640,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -505,7 +651,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -521,7 +667,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
 	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+		palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
 
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
@@ -530,13 +676,22 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then
+	 * enable parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = lazy_compute_parallel_workers(lvstate->relation,
+														 lvstate->options.nworkers,
+														 lvstate->nindexes);
+
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_dead_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -584,7 +739,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -592,7 +747,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -639,7 +794,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -648,7 +803,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -677,7 +832,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -701,7 +856,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -714,8 +869,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_dead_tuples - dead_tuples->num_dead_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_dead_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -743,10 +898,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -759,14 +911,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -804,7 +956,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -837,7 +989,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -956,7 +1108,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_dead_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -995,7 +1147,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1135,7 +1287,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1204,11 +1356,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_dead_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1216,7 +1369,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1332,7 +1485,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_dead_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1366,7 +1519,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_dead_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1382,10 +1535,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1395,7 +1545,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1412,8 +1562,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1468,8 +1620,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1479,7 +1632,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_dead_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1488,7 +1641,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1497,8 +1650,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1533,8 +1687,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVTidMap *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1546,16 +1701,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_dead_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1576,7 +1731,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1675,6 +1830,90 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes, with parallel workers if requested. This
+ * function is used by the parallel vacuum leader process. In parallel lazy
+ * vacuum, we save the index bulk-deletion results to the shared memory space
+ * we prepared, since the results returned from ambulkdelete and
+ * amvacuumcleanup might exist only in local memory. Since all vacuum workers
+ * process different indexes, we can write the results without locking.
+ */
+static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate, IndexBulkDeleteResult **stats,
+								   LVTidMap *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	if (lvstate->nindexes < 1)
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *r = NULL;
+
+		/*
+		 * Get the next index number to vacuum and fetch its statistics. In parallel
+		 * lazy vacuum, index bulk-deletion results are stored in the shared memory
+		 * segment; if a result has already been updated there, we use it rather than
+		 * starting from NULL. In single vacuum, we can always use an element of 'stats'.
+		 */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done with all indexes? (check this before touching the stats arrays) */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		if (do_parallel)
+		{
+			if (lvshared->indstats[idx].updated)
+				r = &(lvshared->indstats[idx].stats);
+		}
+		else
+			r = stats[idx];
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (!for_cleanup)
+			r = lazy_vacuum_index(lvstate->indRels[idx], r,
+								  vacrelstats->old_rel_pages,
+								  dead_tuples);
+		else
+			r = lazy_cleanup_index(lvstate->indRels[idx], r,
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+								   !do_parallel);
+
+		if (do_parallel && r)
+		{
+			/* save index bulk-deletion result to the shared memory space */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), r, sizeof(IndexBulkDeleteResult));
+
+			/* save pointer to the shared memory segment */
+			r = &(lvshared->indstats[idx].stats);
+		}
+		else if (!do_parallel)
+			stats[idx] = r;		/* remember the result for the next round */
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1682,11 +1919,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVTidMap *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1696,28 +1933,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions %s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_dead_tuples,
+					IsParallelWorker() ? " by vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1726,27 +1964,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1767,8 +1999,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2078,15 +2309,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVTidMap *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVTidMap	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2100,34 +2332,46 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * For parallel lazy vacuum, we enter parallel mode and prepare all the
+	 * memory necessary for executing parallel lazy vacuum, including the
+	 * space to store dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+
+		/* Preparation was a success, return the dead tuple space */
+		if (dead_tuples)
+			return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	dead_tuples = (LVTidMap *) palloc(SizeOfLVTidMap + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_dead_tuples = 0;
+	dead_tuples->max_dead_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_dead_tuples < dead_tuples->max_dead_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_dead_tuples] = *itemptr;
+		dead_tuples->num_dead_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_dead_tuples);
 	}
 }
 
@@ -2141,12 +2385,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_dead_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2294,3 +2538,357 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum processes each index with a single vacuum process. The sizes
+ * of the table and its indexes do not affect the parallel degree.
+ */
+static int
+lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers = nindexes - 1;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+
+		/* the reloption defaults to -1, which means it is not set */
+		if (relopts->parallel_workers >= 0)
+			parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	Assert(parallel_workers >= 0);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, then allocate and initialize a DSM segment. Returns
+ * the memory space for storing dead tuples, or NULL if no workers were prepared.
+ */
+static LVTidMap *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVTidMap	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVTidMap),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * workers don't have a debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* stop preparation and exit parallel mode if we failed to allocate a DSM segment */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_dead_tuples = maxtuples;
+	tidmap->num_dead_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* store the query string for workers, if we have one */
+	if (debug_query_string)
+	{
+		sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+		memcpy(sharedquery, debug_query_string, querylen + 1);
+		sharedquery[querylen] = '\0';
+		shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+	}
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and exit parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then apply them after exiting parallel
+ * mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin a parallel index vacuum or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either index vacuuming or index cleanup.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to exit parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVTidMap	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVTidMap *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_all_indexes_for_worker(indrels, nindexes, lvshared,
+									   dead_tuples,
+									   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function is used by the parallel vacuum worker
+ * processes. Similar to the leader process in parallel lazy vacuum, we save index
+ * bulk-deletion results to the shared memory space.
+ */
+static void
+lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+								   LVShared *lvshared, LVTidMap *dead_tuples,
+								   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done with all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* If this index has already been processed before, get the pointer */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		if (!for_cleanup)
+			stats = lazy_vacuum_index(indrels[idx], stats, lvshared->reltuples,
+									  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], stats, lvshared->reltuples,
+							   lvshared->estimated_count, false);
+
+		if (stats)
+		{
+			/* Update the shared index statistics */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), stats, sizeof(IndexBulkDeleteResult));
+		}
+	}
+}
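
The work distribution above boils down to each process claiming the next
index with an atomic fetch-and-add on the shared 'nprocessed' counter. A
self-contained toy (plain C11 atomics plus pthreads, nothing
PostgreSQL-specific) sketching the same scheme:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NINDEXES 8
#define NWORKERS 3

static atomic_uint nprocessed;

static void *
worker(void *arg)
{
	int			id = (int) (long) arg;

	for (;;)
	{
		/* Claim the next unprocessed index; each index is claimed exactly once */
		unsigned int idx = atomic_fetch_add(&nprocessed, 1);

		if (idx >= NINDEXES)
			break;
		printf("worker %d processes index %u\n", id, idx);
	}
	return NULL;
}

int
main(void)
{
	pthread_t	threads[NWORKERS];

	for (long i = 0; i < NWORKERS; i++)
		pthread_create(&threads[i], NULL, worker, (void *) i);
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(threads[i], NULL);
	return 0;
}
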
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 273e275..2d27af5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1668,8 +1668,11 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
+	COMPARE_NODE_FIELD(rels);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 2c2208f..1707959 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOption *makeVacOpt(VacuumOptionFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOption		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10478,22 +10480,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10501,20 +10505,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOption *vacopt1 = $1;
+					VacuumOption *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be more than 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10526,16 +10550,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16033,6 +16057,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOption with the given flag and parallel worker count.
+ */
+static VacuumOption *
+makeVacOpt(VacuumOptionFlag flag, int nworkers)
+{
+	VacuumOption *vacopt = palloc(sizeof(VacuumOption));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2d5086d..67ac5e3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,7 +188,7 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
+	VacuumOption at_vacoptions;	/* bitmask of VacuumOptionFlag */
 	VacuumParams at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
@@ -2482,7 +2482,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,10 +2883,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3132,10 +3133,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 970c94e..23dc6d3 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index dfff23a..5b87241 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -15,6 +15,7 @@
 #define VACUUM_H
 
 #include "access/htup.h"
+#include "access/parallel.h"
 #include "catalog/pg_class.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
@@ -163,7 +164,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -197,8 +198,10 @@ extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
 					 VacuumParams *params, int options, LOCKMODE lmode);
 
 /* in commands/vacuumlazy.c */
-extern void lazy_vacuum_rel(Relation onerel, int options,
+extern void lazy_vacuum_rel(Relation onerel, VacuumOption options,
 				VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e5bdc1c..a2b4662 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3144,7 +3144,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3153,7 +3153,14 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 7,	/* do lazy VACUUM in parallel */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8	/* don't skip any pages */
+} VacuumOptionFlag;
+
+typedef struct VacuumOption
+{
+	VacuumOptionFlag	flags;	/* OR of VacuumOptionFlag */
+	int					nworkers;	/* # of parallel vacuum workers */
 } VacuumOption;
 
 /*
@@ -3173,9 +3180,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOption	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#13Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#12)
2 attachment(s)

On Fri, Dec 28, 2018 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Dec 20, 2018 at 3:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patches. I scaled back the scope of this patch.
The patch now includes only feature (a), that is, it executes both index
vacuum and index cleanup in parallel. It also doesn't include
autovacuum support for now.

The PARALLEL option works almost the same as in the previous patch. In
the VACUUM command, we can specify the 'PARALLEL n' option where n is
the number of parallel workers to request. If n is omitted, the number
of parallel workers would be # of indexes - 1.
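
To make that concrete, the computation is shaped roughly like the
following sketch (the real logic is lazy_compute_parallel_workers in
the attached patch; this is just an illustration):

static int
compute_parallel_workers_sketch(int nrequested, int nindexes)
{
	int			parallel_workers;

	if (nindexes <= 1)
		return 0;				/* nothing to process in parallel */

	if (nrequested > 0)
		parallel_workers = Min(nrequested, nindexes - 1);
	else
		parallel_workers = nindexes - 1;	/* one index per process */

	/* the GUC further limits the request */
	return Min(parallel_workers, max_parallel_maintenance_workers);
}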

I think for now this is okay, but I guess in the future, when we make
heap scans also parallel or maybe allow more than one worker per index
vacuum, this won't hold good. So, I am not sure if the below text in
the docs is most appropriate.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1 processes, which is the
+      maximum number of parallel vacuum workers since each individual index is processed by
+      one process. The actual number of parallel vacuum workers may be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option cannot be used with the <literal>FULL</literal> option.

It might be better to use some generic language in the docs, something
like "If the parallel degree N is omitted, then vacuum decides the
number of workers based on the number of indexes on the relation, which
is further limited by max_parallel_maintenance_workers".

Thank you for the review.

I agreed your concern and the text you suggested.

I think you
also need to mention in some way that you consider the storage option
parallel_workers.

Added.

Few assorted comments:
1.
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
{
..
+
+ LaunchParallelWorkers(lvstate->pcxt);
+
+ /*
+ * If no workers were launched, the leader process vacuums all indexes
+ * alone. Since we may be able to launch workers during the next
+ * execution, we don't want to end parallel mode yet.
+ */
+ if (lvstate->pcxt->nworkers_launched == 0)
+ return;

It is quite possible that the workers are not launched because we fail
to allocate memory, basically when pcxt->nworkers is zero. I think in
such cases there is no point in being in parallel mode. You can even
detect that before calling lazy_begin_parallel_vacuum_index.

Agreed. We can stop the preparation and exit parallel mode if
pcxt->nworkers is 0 after InitializeParallelDSM().
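
Something like the following (a rough sketch of the idea, not the final
code):

	EnterParallelMode();
	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
								 request, true);
	InitializeParallelDSM(pcxt);

	/*
	 * If no DSM could be allocated, pcxt->nworkers is 0 and no worker
	 * can ever be launched, so give up on parallelism right away.
	 */
	if (pcxt->nworkers == 0)
	{
		DestroyParallelContext(pcxt);
		ExitParallelMode();
		/* fall back to allocating the dead tuple space in local memory */
	}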

2.
static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
IndexBulkDeleteResult **stats,
+    LVTidMap *dead_tuples, bool do_parallel,
+    bool for_cleanup)
{
..
+ if (do_parallel)
+ lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+ for (;;)
+ {
+ IndexBulkDeleteResult *r = NULL;
+
+ /*
+ * Get the next index number to vacuum and set index statistics. In parallel
+ * lazy vacuum, index bulk-deletion results are stored in the shared memory
+ * segment. If it's already updated we use it rather than setting it to NULL.
+ * In single vacuum, we can always use an element of the 'stats'.
+ */
+ if (do_parallel)
+ {
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ if (lvshared->indstats[idx].updated)
+ r = &(lvshared->indstats[idx].stats);
+ }

It is quite possible that we are not able to launch any workers in
lazy_begin_parallel_vacuum_index; in such cases, we should not use the
parallel mode path. Basically, as written, we can't rely on the
'do_parallel' flag.

Fixed.
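
The leader now keys off the actual launch result rather than the
request. Roughly (a sketch using the field names from the attached
patch):

	bool		in_parallel = do_parallel &&
							  lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
	int			nprocessed_local = 0;

	for (;;)
	{
		IndexBulkDeleteResult *r = NULL;
		int			idx;

		/* claim the next index from shared or local state as appropriate */
		if (in_parallel)
			idx = pg_atomic_fetch_add_u32(&(lvstate->lvshared->nprocessed), 1);
		else
			idx = nprocessed_local++;

		if (idx >= lvstate->nindexes)
			break;

		if (in_parallel && lvstate->lvshared->indstats[idx].updated)
			r = &(lvstate->lvshared->indstats[idx].stats);

		/* vacuum or clean up lvstate->indRels[idx], accumulating into r */
	}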

Attached new version patch.

Rebased.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v11-0001-Add-parallel-option-to-VACUUM-command.patchapplication/octet-stream; name=v11-0001-Add-parallel-option-to-VACUUM-command.patchDownload
From 751ebf5a115d0198a5df8b7a0ba65efa665d8ac7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v11 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum can be requested by specifying, for example,
VACUUM (PARALLEL 2) tbl, which performs vacuum with 2
parallel worker processes. Alternatively, setting the parallel_workers
reloption to a value greater than 0 invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  11 +-
 doc/src/sgml/ref/vacuum.sgml          |  25 +
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  76 ++--
 src/backend/commands/vacuumlazy.c     | 832 +++++++++++++++++++++++++++++-----
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |  11 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/commands/vacuum.h         |   7 +-
 src/include/nodes/parsenodes.h        |  17 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 13 files changed, 876 insertions(+), 195 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b6f5822..913855e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,11 +2185,12 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from
+         the pool of processes established by <xref
          linkend="guc-max-worker-processes"/>, limited by <xref
          linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..95d6ff20 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,21 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and index cleanup in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the
+      parallel degree <replaceable class="parameter">N</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +277,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overwritten by setting
+    <replaceable class="parameter">N</replaceable> of <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 9c55c20..532b1a9 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -23,6 +23,7 @@
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
+#include "commands/vacuum.h"
 #include "executor/execParallel.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"lazy_parallel_vacuum_main", lazy_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ff1e178..6fbeea5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -68,13 +68,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options);
+static List *get_all_vacuum_rels(VacuumOption options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +89,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -112,11 +112,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -163,7 +169,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +180,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +190,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +212,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +222,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +287,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +341,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +360,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +396,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -603,7 +609,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +641,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +653,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +679,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +748,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOption options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +766,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1521,7 +1527,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOption options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1548,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1588,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1611,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1683,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1702,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1710,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index b67267f..6ef5e93 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each individual index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (at lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains shared information as
+ * well as the memory space for dead tuples. When starting either index vacuum
+ * or index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed the parallel worker processes exit and the leader process
+ * re-initializes the shared memory segment. Note that all parallel workers
+ * live only during one index vacuum or index cleanup pass, but the leader
+ * process neither exits from parallel mode nor destroys the parallel context.
+ * Since no updates are allowed during parallel mode, the leader updates the
+ * index statistics after exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,8 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,10 +128,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM key for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* is the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVTidMap stores the dead tuple TIDs collected during the heap scan. It is
+ * allocated in a dynamic shared memory segment in parallel lazy vacuum mode, or
+ * in local memory otherwise.
+ */
+typedef struct LVTidMap
+{
+	int			max_dead_tuples;	/* # slots allocated in array */
+	int			num_dead_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVTidMap;
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel vacuum index and cleanup index. This is allocated in a
+ * dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either vacuuming index or
+	 * cleanup index.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for vacuum index or cleanup index, or both necessary for
+	 * IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either an
+	 * old live tuples in vacuum index or th new live tuples in cleanup index.
+	 *
+	 * estimated_count is true if the reltuples is estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. The variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -129,16 +215,34 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy vacuum execution, present only in the leader
+ * process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are non-NULL
+ * and point into a dynamic shared memory segment.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOption	options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;
+
+	/* Variables for parallel lazy vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -151,31 +255,43 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVTidMap	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVTidMap *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVTidMap *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static int lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes);
+static LVTidMap *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
+											   IndexBulkDeleteResult **stats,
+											   LVTidMap *dead_tuples,
+											   bool for_cleanup);
+static void lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+												LVShared *lvshared, LVTidMap *dead_tuples,
+												bool for_cleanup);
 
 /*
  *	lazy_vacuum_rel() -- perform LAZY VACUUM for one heap relation
@@ -187,9 +303,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+lazy_vacuum_rel(Relation onerel, VacuumOption options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -201,6 +318,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -218,7 +336,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,7 +364,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -259,10 +377,23 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -333,7 +464,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -465,14 +596,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we run both index vacuum and index cleanup in parallel. When allocating the
+ *		space for the heap scan, we enter parallel mode, create the parallel
+ *		context and initialize a dynamic shared memory segment for dead tuples.
+ *		dead_tuples points either to a dynamic shared memory segment in the
+ *		parallel vacuum case or to local memory in the single vacuum case. Before
+ *		starting parallel index vacuum and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after all indexes have been
+ *		processed; the leader process re-initializes the parallel context and
+ *		re-launches them at the next execution. The index statistics are updated by
+ *		the leader after exiting from parallel mode, since writes are not allowed there.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVTidMap	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -495,6 +640,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -505,7 +651,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -521,7 +667,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
 	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+		palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
 
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
@@ -530,13 +676,22 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = lazy_compute_parallel_workers(lvstate->relation,
+														 lvstate->options.nworkers,
+														 lvstate->nindexes);
+
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_dead_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -584,7 +739,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -592,7 +747,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -639,7 +794,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -648,7 +803,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -677,7 +832,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -701,7 +856,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -714,8 +869,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_dead_tuples - dead_tuples->num_dead_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_dead_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -743,10 +898,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -759,14 +911,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -804,7 +956,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -837,7 +989,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -956,7 +1108,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_dead_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -995,7 +1147,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1135,7 +1287,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1204,11 +1356,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_dead_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1216,7 +1369,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1332,7 +1485,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_dead_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1366,7 +1519,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_dead_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1382,10 +1535,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1395,7 +1545,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1412,8 +1562,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes_for_leader(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1468,8 +1620,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1479,7 +1632,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_dead_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1488,7 +1641,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1497,8 +1650,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1533,8 +1687,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVTidMap *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1546,16 +1701,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_dead_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1576,7 +1731,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1675,6 +1830,88 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes, with parallel workers if requested. This
+ * function is used by the parallel vacuum leader process. In parallel lazy
+ * vacuum, we save the index bulk-deletion results to the shared memory space
+ * we prepared, since the results returned from ambulkdelete and
+ * amvacuumcleanup might exist only in local memory. Since all vacuum workers
+ * process different indexes, we can write them without locking.
+ */
+static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate, IndexBulkDeleteResult **stats,
+								   LVTidMap *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	/* nothing to do if the table has no indexes */
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *r = NULL;
+
+		/*
+		 * Get the next index number to vacuum and set index statistics. In parallel
+		 * lazy vacuum, index bulk-deletion results are stored in the shared memory
+		 * segment. If it has already been updated we use it rather than setting it
+		 * to NULL. In single-process vacuum, we can always use an element of 'stats'.
+		 */
+		if (do_parallel)
+		{
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+			if (idx < lvstate->nindexes && lvshared->indstats[idx].updated)
+				r = &(lvshared->indstats[idx].stats);
+		}
+		else
+		{
+			idx = nprocessed++;
+			r = (idx < lvstate->nindexes) ? stats[idx] : NULL;
+		}
+
+		/* Done for all indexes? */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (!for_cleanup)
+			r = lazy_vacuum_index(lvstate->indRels[idx], r,
+								  vacrelstats->old_rel_pages,
+								  dead_tuples);
+		else
+			r = lazy_cleanup_index(lvstate->indRels[idx], r,
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+								   !do_parallel);
+
+		if (do_parallel && r)
+		{
+			/* save index bulk-deletion result to the shared memory space */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), r, sizeof(IndexBulkDeleteResult));
+
+			/* save pointer to the shared memory segment */
+			r = &(lvshared->indstats[idx].stats);
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1682,11 +1919,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVTidMap *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1696,28 +1933,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions%s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_dead_tuples,
+					IsParallelWorker() ? " by vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1726,27 +1964,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1767,8 +1999,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2078,15 +2309,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVTidMap *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVTidMap	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2100,34 +2332,46 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * In parallel lazy vacuum, we enter parallel mode and prepare all the memory
+	 * necessary for executing parallel lazy vacuum, including the space to store
+	 * dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+
+		/* Preparation was a success, return the dead tuple space */
+		if (dead_tuples)
+			return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	dead_tuples = (LVTidMap *) palloc(SizeOfLVTidMap + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_dead_tuples = 0;
+	dead_tuples->max_dead_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_dead_tuples < dead_tuples->max_dead_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_dead_tuples] = *itemptr;
+		dead_tuples->num_dead_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_dead_tuples);
 	}
 }
 
@@ -2141,12 +2385,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_dead_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2294,3 +2538,349 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. A vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum processes each index in a separate vacuum process. The size of
+ * the table and its indexes does not affect the parallel degree.
+ */
+static int
+lazy_compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers = nindexes - 1;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+		parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, then allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples, or NULL if no workers are prepared.
+ */
+static LVTidMap *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVTidMap	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "lazy_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* quick exit if no workers are prepared, e.g. under serializable isolation */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVTidMap),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * workers don't have debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_dead_tuples = maxtuples;
+	tidmap->num_dead_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin parallel index vacuum or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either vacuuming indexes or cleaning indexes.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVTidMap	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVTidMap *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_all_indexes_for_worker(indrels, nindexes, lvshared,
+									   dead_tuples,
+									   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function is used by the parallel vacuum
+ * worker processes. As with the leader process in parallel lazy vacuum, we
+ * save index bulk-deletion results to the shared memory space.
+ */
+static void
+lazy_vacuum_all_indexes_for_worker(Relation *indrels, int nindexes,
+								   LVShared *lvshared, LVTidMap *dead_tuples,
+								   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* If this index has already been processed before, get the pointer */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		if (!for_cleanup)
+			stats = lazy_vacuum_index(indrels[idx], stats, lvshared->reltuples,
+									  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], stats, lvshared->reltuples,
+							   lvshared->estimated_count, false);
+
+		if (stats)
+		{
+			/* Update the shared index statistics */
+			lvshared->indstats[idx].updated = true;
+			memcpy(&(lvshared->indstats[idx].stats), stats, sizeof(IndexBulkDeleteResult));
+		}
+	}
+}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 133df1b..5bea393 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1667,8 +1667,10 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235..91ad021 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOption *makeVacOpt(VacuumOptionFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOption		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10476,22 +10478,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10499,20 +10503,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOption *vacopt1 = $1;
+					VacuumOption *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10524,16 +10548,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16031,6 +16055,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOption with the given options.
+ */
+static VacuumOption *
+makeVacOpt(VacuumOptionFlag flag, int nworkers)
+{
+	VacuumOption *vacopt = palloc(sizeof(VacuumOption));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4cf6787..1d1b5c4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,7 +188,7 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
+	VacuumOption at_vacoptions;	/* VACUUM options and parallel degree */
 	VacuumParams at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
@@ -2482,7 +2482,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,10 +2883,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3132,10 +3133,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 27ae6be..d11d397 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0588139..f022b94 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -15,6 +15,7 @@
 #define VACUUM_H
 
 #include "access/htup.h"
+#include "access/parallel.h"
 #include "catalog/pg_class.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
@@ -163,7 +164,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOption options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -197,8 +198,10 @@ extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
 					 VacuumParams *params, int options, LOCKMODE lmode);
 
 /* in commands/vacuumlazy.c */
-extern void lazy_vacuum_rel(Relation onerel, int options,
+extern void lazy_vacuum_rel(Relation onerel, VacuumOption options,
 				VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void lazy_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fe..6674c5f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3143,7 +3143,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3152,7 +3152,14 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 7,	/* do lazy VACUUM in parallel */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8	/* don't skip any pages */
+} VacuumOptionFlag;
+
+typedef struct VacuumOption
+{
+	VacuumOptionFlag	flags;	/* OR of VacuumOptionFlag */
+	int					nworkers;	/* # of parallel vacuum workers */
 } VacuumOption;
 
 /*
@@ -3172,9 +3179,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOption	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

Attachment: v11-0002-Add-P-option-to-vacuumdb-command.patch (application/octet-stream)
From d783a8c5ee81b43b909d2d311c1696b985fe6552 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 16:08:24 +0900
Subject: [PATCH v11 2/2] Add -P option to vacuumdb command.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  8 +++++++
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index da4d51e..aa5f120 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -175,6 +175,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-maintenance-workers"/>
+        setting is high enough.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7cb2542..7c9532c 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 127b75e..e2bdddd 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,8 @@ typedef struct vacuumingOptions
 	bool		freeze;
 	bool		disable_page_skipping;
 	bool		skip_locked;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -110,6 +112,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -137,6 +140,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -144,7 +148,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -211,6 +215,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -267,9 +290,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -737,6 +773,16 @@ prepare_vacuum_command(PQExpBuffer sql, PGconn *conn,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1075,6 +1121,7 @@ help(const char *progname)
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

#14Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#13)

On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

Rebased.

I started reviewing the patch; I haven't finished my review yet.
Following are some of the comments.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with

I doubt that users can understand the terms index vacuum and index cleanup.
Maybe it needs some more detailed information.

- VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
+ VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
+ VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
+} VacuumOptionFlag;

Any specific reason behind not adding it as the last member of the enum?

-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {

I don't find the new name quite good; how about VacuumFlags?

+typedef struct VacuumOption
+{

How about VacuumOptions? Because this structure can contain all the
options provided to the vacuum operation.

+ vacopt1->flags |= vacopt2->flags;
+ if (vacopt2->flags == VACOPT_PARALLEL)
+ vacopt1->nworkers = vacopt2->nworkers;
+ pfree(vacopt2);
+ $$ = vacopt1;
+ }

As the above statement indicates, the last specified number of parallel
workers is the one taken into account; can we explain that in the docs?
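
If I read the merge action correctly, the flags are OR'ed together but
nworkers is simply overwritten by each PARALLEL entry, so for example
(a sketch, not tested):

VACUUM (PARALLEL 8, VERBOSE, PARALLEL 2) tbl;  -- requests 2 workers
VACUUM (PARALLEL 2, PARALLEL 8) tbl;           -- requests 8 workers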

postgres=# vacuum (parallel 2, verbose) tbl;

With verbose, no information related to parallel workers is available.
I feel that information should be given even when it is not a parallel
vacuum.

Regards,
Haribabu Kommi
Fujitsu Australia

#15Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#14)
2 attachment(s)

On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Rebased.

I started reviewing the patch; I haven't finished my review yet.
Following are some of the comments.

Thank you for reviewing the patch.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with

I doubt that users can understand the terms index vacuum and index cleanup.
Maybe it needs some more detailed information.

Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
phases, so maybe adding a reference to it would work.

- VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
+ VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
+ VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
+} VacuumOptionFlag;

Any specific reason behind not adding it as the last member of the enum?

My mistake, fixed it.

-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
{

I don't find the new name quite good; how about VacuumFlags?

I agree with removing "Option" from the name, but I think VacuumFlag
would be better because this enum represents only one flag. Thoughts?

+typedef struct VacuumOption
+{

How about VacuumOptions? Because this structure can contain all the
options provided to the vacuum operation.

Agreed.

+ vacopt1->flags |= vacopt2->flags;
+ if (vacopt2->flags == VACOPT_PARALLEL)
+ vacopt1->nworkers = vacopt2->nworkers;
+ pfree(vacopt2);
+ $$ = vacopt1;
+ }

As the above statement indicates, the last specified number of parallel
workers is the one taken into account; can we explain that in the docs?

Agreed.

postgres=# vacuum (parallel 2, verbose) tbl;

With verbose, no information related to parallel workers is available.
I feel that information should be given even when it is not a parallel
vacuum.

Agreed. How about the following verbose output? I've added the number
of launched, planned, and requested vacuum workers, and the purpose
(vacuum or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO: vacuuming "public.test"
INFO: launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO: scanned index "test_idx1" to remove 2000 row versions
DETAIL: CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO: scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO: scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO: "test": removed 2000 row versions in 10 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO: index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM
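
For reference, the "planned" number above follows the cap in
lazy_compute_parallel_workers; assuming the default
max_parallel_maintenance_workers = 2, the computation for this
3-index table is:

workers = Min(Min(requested, nindexes - 1), max_parallel_maintenance_workers)
        = Min(Min(30, 3 - 1), 2) = 2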

Since the previous patch conflicts with commit 285d8e12, I've attached
the latest version patch incorporating the review comments I got.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v12-0002-Add-P-option-to-vacuumdb-command.patch (text/x-patch)
From 5dbfd59c5d4a89c4559f9ee0e49b69ab29ecc714 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 16:08:24 +0900
Subject: [PATCH v12 2/2] Add -P option to vacuumdb command.

---
 doc/src/sgml/ref/vacuumdb.sgml       | 16 ++++++++++++
 src/backend/access/heap/vacuumlazy.c |  9 +++----
 src/bin/scripts/t/100_vacuumdb.pl    |  8 ++++++
 src/bin/scripts/vacuumdb.c           | 49 +++++++++++++++++++++++++++++++++++-
 4 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index da4d51e..aa5f120 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -175,6 +175,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-maintenance-workers"/>
+        setting is high enough.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d12df24..d2caf5f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2642,12 +2642,9 @@ lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
 	 * Finally, estimate VACUUM_KEY_QUERY_TEXT space. Auto vacuums don't have
 	 * debug_query_string.
 	 */
-	if (debug_query_string)
-	{
-		querylen = strlen(debug_query_string);
-		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
-		shm_toc_estimate_keys(&pcxt->estimator, 1);
-	}
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
 	/* create the DSM */
 	InitializeParallelDSM(pcxt);
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 3d0ba58..16a68af 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 127b75e..e2bdddd 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,8 @@ typedef struct vacuumingOptions
 	bool		freeze;
 	bool		disable_page_skipping;
 	bool		skip_locked;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -110,6 +112,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -137,6 +140,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -144,7 +148,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -211,6 +215,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -267,9 +290,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -737,6 +773,16 @@ prepare_vacuum_command(PQExpBuffer sql, PGconn *conn,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1075,6 +1121,7 @@ help(const char *progname)
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

v12-0001-Add-parallel-option-to-VACUUM-command.patch (text/x-patch)
From 1799e263526b808118187a73b4d5fad574e14343 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v12 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader
process, process indexes one by one.

Parallel vacuum can be requested by specifying, for example,
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with
2 parallel worker processes. Setting the parallel_workers
reloption to more than 0 also invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  24 +-
 doc/src/sgml/ref/vacuum.sgml          |  28 ++
 src/backend/access/heap/vacuumlazy.c  | 885 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  78 +--
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |  13 +-
 src/backend/tcop/utility.c            |   4 +-
 src/bin/scripts/t/100_vacuumdb.pl     |   2 +-
 src/include/access/heapam.h           |   6 +-
 src/include/commands/vacuum.h         |   2 +-
 src/include/nodes/parsenodes.h        |  19 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 15 files changed, 940 insertions(+), 209 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b6f5822..b77a2bd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,18 +2185,18 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
-         number of workers may not actually be available at run time.
-         If this occurs, the utility operation will run with fewer
-         workers than expected.  The default value is 2.  Setting this
-         value to 0 disables the use of parallel workers by utility
-         commands.
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from the pool of
+         processes established by <xref linkend="guc-max-worker-processes"/>,
+         limited by <xref linkend="guc-max-parallel-workers"/>.
+         Note that the requested number of workers may not actually be
+         available at run time.  If this occurs, the utility operation
+         will run with fewer workers than expected.  The default value
+         is 2.  Setting this value to 0 disables the use of parallel
+         workers by utility commands.
         </para>
 
         <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..c1f05bd 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and index cleanup in parallel with
+      <replaceable class="parameter">N</replaceable> background workers (for details
+      of each vacuum phase, please refer to <xref linkend="vacuum-phases"/>). If the
+      parallel degree <replaceable class="parameter">N</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. If this option
+      is specified multiple times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is taken into account.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +280,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overridden by the parallel degree
+    <replaceable class="parameter">N</replaceable> of the <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2d317a9..d12df24 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each index is processed by one vacuum process. At the beginning of lazy
+ * vacuum (at lazy_scan_heap) we prepare the parallel context and initialize
+ * the shared memory segment that contains shared information as well as the
+ * memory space for dead tuples. When starting either index vacuum or index
+ * cleanup, we launch parallel worker processes. Once all indexes are processed
+ * the parallel worker processes exit and the leader process re-initializes the
+ * shared memory segment. Note that the parallel workers live only during one
+ * index vacuum or index cleanup pass, but the leader process neither exits
+ * from parallel mode nor destroys the parallel context. Since no updates are
+ * allowed while in parallel mode, we update the index statistics after
+ * exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,8 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -111,10 +128,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM keys for parallel lazy vacuum */
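+/* (The key values are arbitrary; they only need to be distinct within this toc.) */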
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* have the stats been updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVTidMap controls the dead tuple TIDs collected during the heap scan. It is
+ * allocated in a dynamic shared memory segment in parallel lazy vacuum mode,
+ * or in local memory otherwise.
+ */
+typedef struct LVTidMap
+{
+	int			max_dead_tuples;	/* # slots allocated in array */
+	int			num_dead_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVTidMap;
+#define SizeOfLVTidMap (offsetof(LVTidMap, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel index vacuum and cleanup. This is allocated in a dynamic
+ * shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of whether to do index vacuum or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for index vacuum or index cleanup that are necessary for
+	 * IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuum and to the new live tuples for index
+	 * cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. The variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -129,16 +215,35 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy heap vacuum execution. This is present only in the
+ * leader process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are not
+ * NULL and point into the dynamic shared memory segment.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOptions	options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;	/* true if parallel vacuum is prepared */
+
+	/* Variables for parallel lazy index vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -151,31 +256,43 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVTidMap	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVTidMap *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVTidMap *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVTidMap *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes(LVState *lvstate,
+									IndexBulkDeleteResult **stats,
+									LVTidMap *dead_tuples,
+									bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVTidMap *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation rel, int nrequests, int nindexes);
 
 /*
  *	vacuum_heap_rel() -- perform VACUUM for one heap relation
@@ -187,9 +304,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -201,6 +319,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -218,7 +337,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -246,7 +365,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -259,10 +378,23 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -333,7 +465,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -465,14 +597,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuum and index cleanup with parallel workers. When
+ *		allocating the space for lazy scan heap, we enter parallel mode, create
+ *		the parallel context and initialize a dynamic shared memory segment for dead
+ *		tuples. dead_tuples points either to the dynamic shared memory segment in
+ *		the parallel vacuum case or to local memory in the single process case.
+ *		Before starting parallel index vacuum and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after they have processed all
+ *		indexes, and the leader process re-initializes the parallel context and
+ *		re-launches them at the next execution. The index statistics are updated by
+ *		the leader after exiting from parallel mode, since no writes are allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVTidMap	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -487,7 +634,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
 				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
+	IndexBulkDeleteResult **indstats = NULL;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -495,6 +642,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -505,7 +653,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -520,9 +668,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
 	vacrelstats->scanned_pages = 0;
@@ -530,13 +675,31 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(lvstate->relation,
+													lvstate->options.nworkers,
+													lvstate->nindexes);
+
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/*
+	 * Allocate the memory for index bulk-deletion results in single vacuum
+	 * mode. In parallel mode, it has already been prepared in the shared
+	 * memory segment.
+	 */
+	if (!lvstate->parallel_ready)
+		indstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_dead_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -584,7 +747,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -592,7 +755,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -639,7 +802,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -648,7 +811,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -677,7 +840,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -701,7 +864,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -714,8 +877,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_dead_tuples - dead_tuples->num_dead_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_dead_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -743,10 +906,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -759,14 +919,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -804,7 +964,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -837,7 +997,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -956,7 +1116,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_dead_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -995,7 +1155,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1135,7 +1295,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1204,11 +1364,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_dead_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1216,7 +1377,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_dead_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1332,7 +1493,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_dead_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1366,7 +1527,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_dead_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1382,10 +1543,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1395,7 +1553,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1412,8 +1570,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1468,8 +1628,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVTidMap *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1479,7 +1640,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_dead_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1488,7 +1649,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1497,8 +1658,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1533,8 +1695,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVTidMap *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1546,16 +1709,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_dead_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1576,7 +1739,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1675,6 +1838,98 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If we're ready to do parallel vacuum, the
+ * work is done with parallel workers, so this function must only be used by
+ * the parallel vacuum leader process.
+ *
+ * In parallel lazy vacuum, we copy the index bulk-deletion results returned
+ * from ambulkdelete and amvacuumcleanup to the shared memory, because they
+ * are allocated in local memory and the same index might be vacuumed by a
+ * different process next time.
+ *
+ * Since all vacuum processes work on different indexes, we can write the
+ * results without locking.
+ */
+static void
+lazy_vacuum_all_indexes(LVState *lvstate, IndexBulkDeleteResult **stats,
+						LVTidMap *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(lvstate->parallel_ready || stats != NULL);
+
+	/* no job if the table has no index */
+	if (lvstate->nindexes < 1)
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+		bool copy_result = false;
+
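+		/*
+		 * The shared nprocessed counter hands out each index number exactly
+		 * once across the leader and workers, so the indexes are divided up
+		 * without any locking.
+		 */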
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		if (do_parallel)
+		{
+			/*
+			 * If there is an already-updated result in the shared memory we
+			 * use it. Otherwise we pass NULL to the index AM and copy the
+			 * result into the shared memory segment afterwards.
+			 */
+			if (lvshared->indstats[idx].updated)
+				result = &(lvshared->indstats[idx].stats);
+			else
+				copy_result = true;
+		}
+		else
+			result = stats[idx];
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (for_cleanup)
+			result = lazy_cleanup_index(lvstate->indRels[idx], result,
+										vacrelstats->new_rel_tuples,
+										vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+										!do_parallel);
+		else
+			result = lazy_vacuum_index(lvstate->indRels[idx], result,
+									   vacrelstats->old_live_tuples,
+									   dead_tuples);
+
+		if (do_parallel && result)
+		{
+			/* save index bulk-deletion result to the shared memory space */
+			lvshared->indstats[idx].updated = true;
+
+			if (copy_result)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1682,11 +1937,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVTidMap *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1696,28 +1951,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions %s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_dead_tuples,
+					IsParallelWorker() ? "by parallel vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1726,27 +1982,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1767,8 +2017,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2078,15 +2327,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVTidMap *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVTidMap	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2100,34 +2350,46 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * In parallel lazy vacuum, we enter parallel mode and prepare all the
+	 * memory necessary for executing parallel lazy vacuum, including the
+	 * space to store dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+
+		/* Preparation was a success, return the dead tuple space */
+		if (dead_tuples)
+			return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
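+	/* Otherwise, allocate the dead tuple space in backend-local memory */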
+	dead_tuples = (LVTidMap *) palloc(SizeOfLVTidMap + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_dead_tuples = 0;
+	dead_tuples->max_dead_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVTidMap *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_dead_tuples < dead_tuples->max_dead_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_dead_tuples] = *itemptr;
+		dead_tuples->num_dead_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_dead_tuples);
 	}
 }
 
@@ -2141,12 +2403,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVTidMap	*dead_tuples = (LVTidMap *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_dead_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2294,3 +2556,378 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum processes each index by one vacuum process. The sizes of the
+ * table and indexes don't affect the parallel degree.
+ */
+static int
+compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+
+		/* Use the reloption only if it has actually been set */
+		if (relopts->parallel_workers >= 0)
+			parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+		else
+			parallel_workers = nindexes - 1;
+	}
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
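+
+	/*
+	 * For example, with 5 indexes and no explicit request, we ask for 4
+	 * workers; with PARALLEL 2 we ask for Min(2, 4) = 2. Either way the
+	 * result is capped below.
+	 */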
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, and allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples, or NULL if no workers were
+ * prepared.
+ */
+static LVTidMap *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVTidMap	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
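+
+	/*
+	 * The entry point name given here is resolved through the built-in
+	 * function table in access/transam/parallel.c (see the change to that
+	 * file in this patch).
+	 */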
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* quick exit if no workers are prepared, e.g. under serializable isolation */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVTidMap),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * workers don't have debug_query_string.
+	 */
+	if (debug_query_string)
+	{
+		querylen = strlen(debug_query_string);
+		shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+	}
+	else
+		querylen = 0;			/* keep compiler quiet */
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVTidMap *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_dead_tuples = maxtuples;
+	tidmap->num_dead_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store the query string for workers, if we have one */
+	if (debug_query_string)
+	{
+		sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+		memcpy(sharedquery, debug_query_string, querylen);
+		sharedquery[querylen] = '\0';
+		shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+	}
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin parallel index vacuum or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either vacuuming indexes or cleaning indexes.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/* Report parallel vacuum worker information */
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker %s (planned: %d, requested: %d)",
+							 "launched %d parallel vacuum workers %s (planned: %d, requested: %d)",
+							 lvstate->pcxt->nworkers_launched),
+					lvstate->pcxt->nworkers_launched,
+					for_cleanup ? "for index cleanup" : "for index vacuum",
+					lvstate->pcxt->nworkers,
+					lvstate->options.nworkers)));
+
+	/*
+	 * If no workers were launched, we vacuum all indexes by the leader
+	 * process alone. Since there is hope that workers can be launched at
+	 * the next execution, we don't end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVTidMap	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
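+
+	/*
+	 * Open the target table and its indexes with the same lock modes as the
+	 * leader; thanks to group locking these requests don't conflict with
+	 * the locks the leader already holds.
+	 */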
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVTidMap *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
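+	/*
+	 * (Each worker tracks its own cost balance; balances are not shared
+	 * across the cooperating processes.)
+	 */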
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function must only be used by parallel
+ * vacuum worker processes. As in the leader process, we copy the index
+ * bulk-deletion results to the shared memory segment.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVTidMap *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+		bool copy_result = false;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If there is an already-updated result in the shared memory we use
+		 * it. Otherwise we pass NULL to the index AM and copy the result
+		 * into the shared memory segment afterwards.
+		 */
+		if (lvshared->indstats[idx].updated)
+			result = &(lvshared->indstats[idx].stats);
+		else
+			copy_result = true;
+
+		/* Vacuum or clean up one index */
+		if (for_cleanup)
+			result = lazy_cleanup_index(indrels[idx], result, lvshared->reltuples,
+									   lvshared->estimated_count, false);
+		else
+			result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
+									  dead_tuples);
+
+		if (result)
+		{
+			/* Update the shared index statistics */
+			lvshared->indstats[idx].updated = true;
+
+			if (copy_result)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 9c55c20..e53af92 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c4522cd..83de654 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -68,13 +68,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options);
+static List *get_all_vacuum_rels(VacuumOptions options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +89,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -112,11 +112,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -144,7 +150,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions struct indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -163,7 +169,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +180,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +190,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +212,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +222,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +287,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +341,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +360,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +396,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -603,7 +609,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +641,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +653,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +679,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +748,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +766,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1521,7 +1527,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1548,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1588,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1611,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1683,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1702,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1710,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 133df1b..5bea393 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1667,8 +1667,11 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
+	COMPARE_NODE_FIELD(rels);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235..e3b6a9a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10476,22 +10478,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10499,20 +10503,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOptions *vacopt1 = $1;
+					VacuumOptions *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10524,16 +10548,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16031,6 +16055,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+/*
+ * Create a VacuumOptions struct with the given flag and parallel degree.
+ */
+static VacuumOptions *
+makeVacOpt(VacuumFlag flag, int nworkers)
+{
+	VacuumOptions *vacopt = palloc(sizeof(VacuumOptions));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
+
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4cf6787..f6bbf22 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,8 +188,8 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
+	VacuumOptions	at_vacoptions;
+	VacuumParams	at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
@@ -2482,7 +2482,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,10 +2883,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;	/* parallel vacuum is not supported for autovacuum */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3132,10 +3133,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 27ae6be..d11d397 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7cb2542..3d0ba58 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 30;
+use Test::More tests => 34;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4b78150..cf4a13d 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,9 +14,11 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -188,6 +190,8 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+
 #endif							/* HEAPAM_H */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..d3503e2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fe..a01c811 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3143,7 +3143,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3152,8 +3152,15 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy VACUUM in parallel */
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	VacuumFlag	flags;	/* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3172,9 +3179,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

#16Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#15)

On Fri, Jan 18, 2019 at 11:42 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <sawada.mshk@gmail.com>

wrote:

Rebased.

I started reviewing the patch; I haven't finished my review yet.
Following are some of the comments.

Thank you for reviewing the patch.

+ <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>

+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with

I doubt that users can understand the terms index vacuum and cleanup index.
Maybe it needs some more detailed information.

Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
phases. So maybe adding a reference to it would work.

OK.

-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
{

I don't find the new name quite good; how about VacuumFlags?

Agreed with removing "Option" from the name, but I think VacuumFlag
would be better because each enum value represents only one flag. Thoughts?
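That is, each value stands for a single flag, and the struct member holds
the bitwise OR of them. Just to illustrate the intended usage of the new
types (illustration only, not code from the patch):

    VacuumOptions opts;

    opts.flags = VACOPT_VACUUM | VACOPT_VERBOSE; /* OR of VacuumFlag values */
    opts.nworkers = 2;    /* parallel degree, meaningful with VACOPT_PARALLEL */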

OK.

postgres=# vacuum (parallel 2, verbose) tbl;

With verbose, no parallel workers related information is available.
I feel that information should be given even when it is not a parallel
vacuum.

Agreed. How about the following verbose output? I've added the number
of launched, planned and requested vacuum workers and the purpose (vacuum
or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO: vacuuming "public.test"
INFO: launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO: scanned index "test_idx1" to remove 2000 row versions
DETAIL: CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO: scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO: scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO: "test": removed 2000 row versions in 10 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO: index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM

The verbose output is good.

Since the previous patch conflicts with 285d8e12, I've attached the
latest version patch that incorporates the review comments I got.

Thanks for the latest patch. I have some more minor comments.

+ Execute index vacuum and cleanup index in parallel with

Better to use vacuum index and cleanup index? This is in line with
the description of vacuum phases. It is better to follow the same
notation in the patch.

+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With this change, lazy_space_alloc takes care of initializing the
parallel vacuum; can we write something about that in the comments?

+ initprog_val[2] = dead_tuples->max_dead_tuples;

The dead_tuples variable may need a rename for better readability?

+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;

I don't see a need for the copy_result variable; how about directly using
the updated flag to decide whether to copy or not? Once the result is
copied, update the flag.
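Something like the following is what I have in mind (just an untested
sketch reusing the field names from the patch; 'stats' stands for the
locally returned bulk-delete result):

    if (lvshared->indstats[idx].updated)
    {
        /* the shared slot already holds valid stats; use it directly */
        result = &(lvshared->indstats[idx].stats);
    }
    else if (stats != NULL)
    {
        /* copy the local result into the shared slot and mark it valid */
        memcpy(&(lvshared->indstats[idx].stats), stats,
               sizeof(IndexBulkDeleteResult));
        lvshared->indstats[idx].updated = true;
    }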

+use Test::More tests => 34;

I don't see any new tests added in this patch.

I am wondering about the performance penalty if we use the parallel
option of vacuum on a small-sized table.

Regards,
Haribabu Kommi
Fujitsu Australia

#17Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#16)
2 attachment(s)

On Tue, Jan 22, 2019 at 9:59 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

Thanks for the latest patch. I have some more minor comments.

Thank you for reviewing the patch.

+ Execute index vacuum and cleanup index in parallel with

Better to use vacuum index and cleanup index? This is in line with
the description of vacuum phases. It is better to follow the same
notation in the patch.

Agreed. I've changed it to "Vacuum index and cleanup index in parallel
with ...".

+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With this change, lazy_space_alloc takes care of initializing the
parallel vacuum; can we write something about that in the comments?

Agreed.
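For example, something like this above the call (draft wording):

    /*
     * Allocate the space for dead tuples. When parallel vacuum is
     * requested, this also enters parallel mode and initializes the
     * dynamic shared memory segment for the shared dead tuple space.
     */
    dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);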

+ initprog_val[2] = dead_tuples->max_dead_tuples;

The dead_tuples variable may need a rename for better readability?

I might not have understood your comment correctly, but I've tried to fix it.
Please review it.

+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;

I don't see a need for the copy_result variable; how about directly using
the updated flag to decide whether to copy or not? Once the result is
copied, update the flag.

You're right. Fixed.

+use Test::More tests => 34;

I don't see any new tests added in this patch.

Fixed.

I am wondering about the performance penalty if we use the parallel
option of vacuum on a small-sized table.

Hmm, unlike other parallel operations (ParallelAppend aside), parallel
vacuum executes multiple index vacuums simultaneously, so it can avoid
contention. I think there is a performance penalty, but it would not
be big.

Attached the latest patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v13-0001-Add-parallel-option-to-VACUUM-command.patch (application/x-patch)
From 53a21a6983d6ef2787bc6e104d0ce62ed905a4d3 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v13 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum is invoked by specifying, for example,
VACUUM (PARALLEL 2) tbl, which performs vacuum with 2 parallel
worker processes. Setting the parallel_workers reloption to a
value greater than 0 also invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  24 +-
 doc/src/sgml/ref/vacuum.sgml          |  28 ++
 src/backend/access/heap/vacuumlazy.c  | 892 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  78 +--
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |  13 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/access/heapam.h           |   5 +-
 src/include/commands/vacuum.h         |   2 +-
 src/include/nodes/parsenodes.h        |  19 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 14 files changed, 945 insertions(+), 208 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b6f5822..b77a2bd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,18 +2185,18 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
-         number of workers may not actually be available at run time.
-         If this occurs, the utility operation will run with fewer
-         workers than expected.  The default value is 2.  Setting this
-         value to 0 disables the use of parallel workers by utility
-         commands.
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken
+         from the pool of processes established by <xref
+         linkend="guc-max-worker-processes"/>, limited by <xref
+         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         number of workers may not actually be available at run time.
+         If this occurs, the utility operation will run with fewer
+         workers than expected.  The default value is 2.  Setting this
+         value to 0 disables the use of parallel workers by utility commands.
         </para>
 
         <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..add3060 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Vacuum index and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background workers (for details
+      of each vacuum phase, please refer to <xref linkend="vacuum-phases"/>). If the
+      parallel degree <replaceable class="parameter">N</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Also, if this option
+      is specified multiple times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is taken into account.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +280,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overridden by the
+    <replaceable class="parameter">N</replaceable> of the <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 37aa484..6d38e02 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each individual index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (at lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains shared information as
+ * well as the memory space for dead tuples. When starting either index vacuum
+ * or index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed the parallel worker processes exit and the leader process
+ * re-initializes the shared memory segment. Note that the parallel workers
+ * live only during one index vacuum or index cleanup pass, but the leader
+ * process neither exits from parallel mode nor destroys the parallel context.
+ * Since no updates are allowed while in parallel mode, we update the index
+ * statistics after exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,10 +126,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM keys for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for index bulk-deletion statistics that are used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* is the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a dynamic shared memory segment in parallel lazy
+ * vacuum mode, or in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel index vacuum and index cleanup. This is allocated in
+ * a dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either vacuuming index or
+	 * cleanup index.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for index vacuum or index cleanup, or both, necessary for
+	 * IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to either
+	 * the old live tuples in index vacuum or the new live tuples in index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. The variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -128,16 +213,35 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy heap vacuum execution, present only in the leader
+ * process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are not NULL;
+ * 'lvshared' points into the dynamic shared memory segment and 'pcxt' is
+ * the parallel context.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOptions	options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;	/* true if parallel vacuum is prepared */
+
+	/* Variables for parallel lazy index vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -150,31 +254,43 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVDeadTuples	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVDeadTuples *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVDeadTuples *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVDeadTuples *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes(LVState *lvstate,
+									IndexBulkDeleteResult **stats,
+									LVDeadTuples *dead_tuples,
+									bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation rel, int nrequests, int nindexes);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -186,9 +302,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -200,6 +317,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -217,7 +335,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +363,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -258,10 +376,23 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -332,7 +463,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -464,14 +595,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuum and index cleanup with parallel workers. When
+ *		allocating the space for the lazy heap scan, we enter parallel mode, create
+ *		the parallel context and initialize a dynamic shared memory segment for dead
+ *		tuples. dead_tuples points either to a dynamic shared memory segment in the
+ *		parallel vacuum case or to local memory in the single-process vacuum case.
+ *		Before starting parallel index vacuum and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after processing all indexes,
+ *		and the leader process re-initializes the parallel context and re-launches
+ *		them at the next execution. The index statistics are updated by the leader
+ *		after exiting from parallel mode, since no writes are allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVDeadTuples	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -486,7 +632,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
 				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
+	IndexBulkDeleteResult **indstats = NULL;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -494,6 +640,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -504,7 +651,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -519,9 +666,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
 	vacrelstats->scanned_pages = 0;
@@ -529,13 +673,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then
+	 * enable parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(lvstate->relation,
+													lvstate->options.nworkers,
+													lvstate->nindexes);
+
+	/*
+	 * Allocate memory space for lazy vacuum. If parallel_workers > 0, we
+	 * prepare for parallel vacuum by entering parallel mode and initializing
+	 * a dynamic shared memory segment.
+	 */
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/*
+	 * Allocate memory for the index bulk-delete results when in single-process
+	 * vacuum mode. In parallel mode, we've already prepared it in the shared
+	 * memory segment.
+	 */
+	if (!lvstate->parallel_ready)
+		indstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +750,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -591,7 +758,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -638,7 +805,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -647,7 +814,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -676,7 +843,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -700,7 +867,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -713,8 +880,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +909,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -758,14 +922,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -803,7 +967,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -836,7 +1000,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -955,7 +1119,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -994,7 +1158,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1134,7 +1298,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1203,11 +1367,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1215,7 +1380,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1331,7 +1496,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1365,7 +1530,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1381,10 +1546,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1394,7 +1556,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1411,8 +1573,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,8 +1631,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1478,7 +1643,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1487,7 +1652,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1496,8 +1661,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1532,8 +1698,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVDeadTuples *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1545,16 +1712,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1575,7 +1742,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1674,6 +1841,98 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If parallel vacuum is ready, the work is
+ * done with parallel workers, so this function must be called only by the
+ * parallel vacuum leader process.
+ *
+ * In parallel lazy vacuum, we copy the index bulk-deletion results returned
+ * from ambulkdelete and amvacuumcleanup to the shared memory area, because
+ * they are allocated in local memory and the same index might be vacuumed by
+ * a different vacuum process next time.
+ *
+ * Since the vacuum workers all process different indexes, we can write the
+ * results without locking.
+ */
+static void
+lazy_vacuum_all_indexes(LVState *lvstate, IndexBulkDeleteResult **stats,
+						LVDeadTuples *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(lvstate->parallel_ready ||
+		   (!lvstate->parallel_ready && stats != NULL));
+
+	/* nothing to do if the table has no indexes */
+	if (lvstate->nindexes <= 0)
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get the next index number to vacuum and set index statistics */
+		if (do_parallel)
+		{
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+			/*
+			 * If there is an already-updated result in shared memory, we
+			 * use it. Otherwise we pass NULL to the index AM and copy the
+			 * result to the shared memory segment.
+			 */
+			if (lvshared->indstats[idx].updated)
+				result = &(lvshared->indstats[idx].stats);
+		}
+		else
+		{
+			idx = nprocessed++;
+			result = stats[idx];
+		}
+
+		/* Done for all indexes? */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (for_cleanup)
+			result = lazy_cleanup_index(lvstate->indRels[idx], result,
+										vacrelstats->new_rel_tuples,
+										vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+										!do_parallel);
+		else
+			result = lazy_vacuum_index(lvstate->indRels[idx], result,
+									   vacrelstats->old_rel_pages,
+									   dead_tuples);
+
+		if (do_parallel && result)
+		{
+			/*
+			 * Save the index bulk-deletion result to the shared memory
+			 * space if it is not stored there yet.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1681,11 +1940,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1695,28 +1954,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions %s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples,
+					IsParallelWorker() ? "by parallel vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1725,27 +1985,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1766,8 +2020,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2077,15 +2330,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVDeadTuples *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVDeadTuples	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2099,34 +2353,46 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * For parallel lazy vacuum, we enter parallel mode and prepare all the
+	 * memory necessary for executing parallel lazy vacuum, including the
+	 * space to store dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+
+		/* Preparation was a success, return the dead tuple space */
+		if (dead_tuples)
+			return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2140,12 +2406,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2293,3 +2559,381 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum assigns each index to one vacuum process. The sizes of the
+ * table and indexes don't affect the parallel degree.
+ */
+static int
+compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+		parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */
+static LVDeadTuples *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* quick exit if no workers are prepared, e.g. under serializable isolation */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * workers don't have debug_query_string.
+	 */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shutdown workers, destroy parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set up shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either vacuuming indexes or cleaning indexes.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	appendStringInfo(&buf,
+					 ngettext("launched %d parallel vacuum worker %s (planned: %d",
+							  "launched %d parallel vacuum workers %s (planned: %d",
+							  lvstate->pcxt->nworkers_launched),
+					 lvstate->pcxt->nworkers_launched,
+					 for_cleanup ? "for index cleanup" : "for index vacuum",
+					 lvstate->pcxt->nworkers);
+	if (lvstate->options.nworkers > 0)
+		appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
+
+	appendStringInfo(&buf, ")");
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since there is hope that we can launch workers at the next
+	 * execution, we don't end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function must be called only by parallel
+ * vacuum worker processes. As in the leader process in parallel lazy vacuum,
+ * we copy the index bulk-deletion results to the shared memory segment.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If there is an already-updated result in shared memory, we use it.
+		 * Otherwise we pass NULL to the index AM and copy the result to the
+		 * shared memory segment.
+		 */
+		if (lvshared->indstats[idx].updated)
+			result = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuuming or cleanup one index */
+		if (for_cleanup)
+			result = lazy_cleanup_index(indrels[idx], result, lvshared->reltuples,
+									   lvshared->estimated_count, false);
+		else
+			result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
+									  dead_tuples);
+
+		if (result)
+		{
+			/*
+			 * Save the index bulk-deletion result to the shared memory
+			 * space if it is not stored there yet.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 9c55c20..e53af92 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..2f8446a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -67,13 +67,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options);
+static List *get_all_vacuum_rels(VacuumOptions options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options,
 		   VacuumParams *params);
 
 /*
@@ -88,15 +88,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -111,11 +111,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -143,7 +149,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions, indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -162,7 +168,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -173,7 +179,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -183,7 +189,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -205,8 +211,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -215,7 +221,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -280,11 +286,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -334,13 +340,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -353,7 +359,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -389,7 +395,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -602,7 +608,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -634,7 +640,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -646,7 +652,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -672,7 +678,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -741,7 +747,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -759,7 +765,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1520,7 +1526,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1541,7 +1547,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1581,10 +1587,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1604,7 +1610,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1676,7 +1682,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1695,7 +1701,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1703,7 +1709,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5c4fa7d..2313e4d 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1668,8 +1668,10 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d8a3c2d..ffd56c9 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10484,22 +10486,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10507,20 +10511,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOptions *vacopt1 = $1;
+					VacuumOptions *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10532,16 +10556,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16039,6 +16063,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOptions with the given options.
+ */
+static VacuumOptions *
+makeVacOpt(VacuumFlag flag, int nworkers)
+{
+	VacuumOptions *vacopt = palloc(sizeof(VacuumOptions));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d1177b3..8555a23 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,8 +187,8 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
+	VacuumOptions	at_vacoptions;
+	VacuumParams	at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
@@ -2481,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2882,10 +2882,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;	/* parallel lazy vacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3131,10 +3132,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 27ae6be..d11d397 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ab08791..d862cf7 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,11 +14,13 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -185,8 +187,9 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..d3503e2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index addc2c2..5d9a6ac 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3144,7 +3144,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3153,8 +3153,15 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy VACUUM in parallel */
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	VacuumFlag	flags;	/* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3173,9 +3180,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

v13-0002-Add-P-option-to-vacuumdb-command.patch (application/x-patch)
From 0a748c27cd81eb2348017864eafe2d113c151b81 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v13 2/2] Add -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index f304627..79e3a25 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -173,6 +173,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-maintenance-workers"/>
+        setting is at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 951202b..2f76f64 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 30;
+use Test::More tests => 34;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index ec7d0a3..2e96225 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,8 @@ typedef struct vacuumingOptions
 	bool		freeze;
 	bool		disable_page_skipping;
 	bool		skip_locked;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -110,6 +112,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -137,6 +140,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -144,7 +148,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -211,6 +215,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -267,9 +290,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -737,6 +773,16 @@ prepare_vacuum_command(PQExpBuffer sql, PGconn *conn,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1075,6 +1121,7 @@ help(const char *progname)
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

#18Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#17)

On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

Attached the latest patches.

Thanks for the updated patches.
Some more code review comments.

+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> and <command>VACUUM</command>
+         without <literal>FULL</literal> option, and only when building
+         a B-tree index.  Parallel workers are taken from the pool of

I feel the above sentence may not give the proper picture; how about the
following modification?

<command>CREATE INDEX</command> only when building a B-tree index
and <command>VACUUM</command> without <literal>FULL</literal> option.

+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Individual indexes is processed by one vacuum process. At beginning of

How about "vacuum index" and "cleanup index", similar to other places?

+ * memory space for dead tuples. When starting either index vacuum or cleanup
+ * vacuum, we launch parallel worker processes. Once all indexes are processed

same here as well?

+ * Before starting parallel index vacuum and parallel cleanup index we launch
+ * parallel workers. All parallel workers will exit after processed all indexes

parallel vacuum index and parallel cleanup index?

+ /*
+ * If there is an already-updated result in shared memory, we
+ * use it. Otherwise we pass NULL to the index AM and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really see a need for the flag to differentiate the stats pointer
between the first run and the second run. I don't see any problem in passing
the stats directly, and the same stats are updated on the worker side and
the leader side. Anyway, no two processes will vacuum the same index at the
same time. Am I missing something?

Even if this flag is to identify whether the stats have been updated before
writing them, I don't see a need for it compared to normal vacuum.
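
For reference, my reading of the intended per-index flow is roughly the
following sketch (paraphrased with the patch's names, not the patch code
itself):

IndexBulkDeleteResult *result = NULL;

/* Reuse the copy in DSM if some process has already stored a result. */
if (lvshared->indstats[idx].updated)
    result = &(lvshared->indstats[idx].stats);

/*
 * On the first pass result is NULL, so the index AM allocates a fresh
 * result in backend-local memory.
 */
result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
                           dead_tuples);

/* A local allocation is invisible to other processes, so copy it out. */
if (result && !lvshared->indstats[idx].updated)
{
    memcpy(&(lvshared->indstats[idx].stats), result,
           sizeof(IndexBulkDeleteResult));
    lvshared->indstats[idx].updated = true;
}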

+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */
+ pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+ request, true);

But we are passing the serializable_okay flag as true, which means it doesn't
return NULL. Is that expected?

+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers? Aren't
both the same?

- COMPARE_SCALAR_FIELD(options);
- COMPARE_NODE_FIELD(rels);
+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

options is changed from a SCALAR comparison to explicit checks, but why is the
rels check removed? Since options is changed from an int to a structure, using
SCALAR may not work in other functions like _copyVacuumStmt, etc.?

+typedef struct VacuumOptions
+{
+ VacuumFlag flags; /* OR of VacuumFlag */
+ int nworkers; /* # of parallel vacuum workers */
+} VacuumOptions;

Do we need to add a NodeTag to the above structure, since it is part of the
VacuumStmt structure?

+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
+        setting is more than one.

How about removing vacuumdb and changing it to "This option will ..."?

I will continue the testing of this patch and share the details.

Regards,
Haribabu Kommi
Fujitsu Australia

#19Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#18)

On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the latest patches.

Thanks for the updated patches.
Some more code review comments.

Thank you!

+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> and <command>VACUUM</command>
+         without <literal>FULL</literal> option, and only when building
+         a B-tree index.  Parallel workers are taken from the pool of

I feel the above sentence may not give the proper picture; how about adding
the following modification?

<command>CREATE INDEX</command> only when building a B-tree index
and <command>VACUUM</command> without <literal>FULL</literal> option.

Agreed.

+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Individual indexes is processed by one vacuum process. At beginning of

How about "vacuum index" and "cleanup index", similar to other places?

+ * memory space for dead tuples. When starting either index vacuum or cleanup
+ * vacuum, we launch parallel worker processes. Once all indexes are processed

same here as well?

+ * Before starting parallel index vacuum and parallel cleanup index we launch
+ * parallel workers. All parallel workers will exit after processed all indexes

parallel vacuum index and parallel cleanup index?

ISTM we're using terms like "index vacuuming", "index cleanup" and "FSM
vacuuming" in vacuumlazy.c, so maybe "parallel index vacuuming" and
"parallel index cleanup" would be better?

+ /*
+ * If there is already-updated result in the shared memory we
+ * use it. Otherwise we pass NULL to index AMs and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really find a need for the flag to differentiate the stats pointer
between the first run and the second run. I don't see any problem in passing
the stats directly; the same stats are updated on the worker side and the
leader side. Anyway, no two processes will vacuum the same index at the same
time. Am I missing something?

Even if this flag is meant to identify whether the stats have been updated
before writing them, I don't see a need for it compared to normal vacuum.

Passing stats = NULL to amvacuumcleanup and ambulkdelete means it's the
first-time execution. For example, btvacuumcleanup skips the cleanup if it's
not NULL. In normal vacuum we pass NULL to ambulkdelete or amvacuumcleanup on
the first call, and they store the result stats in locally allocated memory.
Therefore, in parallel vacuum I think both workers and the leader need to move
the stats to the shared memory and mark them as updated, since a different
worker could vacuum a given index the next time.
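
To make this concrete, here is a minimal C sketch of the convention being
described (the IndStatsSlot type and the helper names are simplified for
illustration; this is not the patch's actual code):

#include "postgres.h"
#include "access/genam.h"	/* IndexBulkDeleteResult */

/* One slot per index, living in the DSM segment (simplified). */
typedef struct IndStatsSlot
{
	bool	updated;				/* has a first run stored a result yet? */
	IndexBulkDeleteResult stats;	/* result copied here after the first run */
} IndStatsSlot;

/*
 * Index AMs such as btbulkdelete/btvacuumcleanup treat a NULL stats
 * pointer as "first call" and allocate a fresh result in local memory,
 * so a process must pass NULL only if no result has been published yet.
 */
static IndexBulkDeleteResult *
stats_for_call(IndStatsSlot *slot)
{
	return slot->updated ? &slot->stats : NULL;
}

/*
 * After ambulkdelete/amvacuumcleanup returns, publish the locally
 * allocated result into the shared slot so that a different worker
 * can continue with it on the next pass.
 */
static void
publish_stats(IndStatsSlot *slot, IndexBulkDeleteResult *result)
{
	if (result != NULL && !slot->updated)
	{
		memcpy(&slot->stats, result, sizeof(IndexBulkDeleteResult));
		slot->updated = true;
	}
}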

+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */
+ pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+ request, true);

But we are passing the serializable_okay flag as true, which means it doesn't
return NULL. Is that expected?

I think you're right. Since the request is never 0 and serializable_okay is
true, it should not return NULL. Will fix.

+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers? Aren't
both the same?

The request is the parallel degree that is specified explicitly by the user,
whereas the planned value is the actual number we plan based on the number of
indexes the table has. For example, if we run 'VACUUM (PARALLEL 3000) tbl'
where tbl has 4 indexes, the request is 3000 and the planned is 4. Also, if
max_parallel_maintenance_workers is 2, the planned is 2.
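
For reference, here is a small C sketch of how "requested" becomes "planned",
mirroring the patch's compute_parallel_workers() (simplified; the
parallel_workers reloption path is omitted and the function name is mine):

#define Min(x, y)	((x) < (y) ? (x) : (y))

/*
 * Cap the user-requested degree by the number of indexes (minus one,
 * since the leader also processes indexes) and then by
 * max_parallel_maintenance_workers.
 */
static int
plan_parallel_workers(int requested, int nindexes,
					  int max_parallel_maintenance_workers)
{
	int		planned;

	if (nindexes <= 1)
		return 0;				/* nothing to run in parallel */

	planned = Min(requested, nindexes - 1);
	planned = Min(planned, max_parallel_maintenance_workers);

	return planned;
}

/*
 * e.g. VACUUM (PARALLEL 3000) on a table with 4 indexes and
 * max_parallel_maintenance_workers = 2:
 *     requested = 3000, planned = plan_parallel_workers(3000, 4, 2) = 2
 */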

- COMPARE_SCALAR_FIELD(options);
- COMPARE_NODE_FIELD(rels);
+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

options is changed from a SCALAR comparison to explicit checks, but why is the
rels check removed? Since options is changed from an int to a structure, using
SCALAR may not work in other functions like _copyVacuumStmt, etc.?

Agreed and will fix.

+typedef struct VacuumOptions
+{
+ VacuumFlag flags; /* OR of VacuumFlag */
+ int nworkers; /* # of parallel vacuum workers */
+} VacuumOptions;

Do we need to add a NodeTag to the above structure, since it is part of the
VacuumStmt structure?

Yes, I will add it.

+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
+        setting is more than one.

How about removing vacuumdb and changing it to "This option will ..."?

Agreed.

I will continue the testing of this patch and share the details.

Thank you. I'll submit the updated patch set.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#19)

On Fri, Feb 1, 2019 at 2:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

Thank you. I'll submit the updated patch set.

I don't see any chance of getting this committed in the next few days, so I
have moved it to the next CF. Thanks for working on this, and I hope you will
continue to work on this project.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#21Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#20)

On Sat, Feb 2, 2019 at 4:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Feb 1, 2019 at 2:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

Thank you. I'll submit the updated patch set.

I don't see any chance of getting this committed in the next few days, so I
have moved it to the next CF. Thanks for working on this, and I hope you will
continue to work on this project.

Thank you!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#19)
2 attachment(s)

On Thu, Jan 31, 2019 at 10:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. I'll submit the updated patch set.

Attached the latest patch set.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v14-0002-Add-P-option-to-vacuumdb-command.patchtext/x-patch; charset=US-ASCII; name=v14-0002-Add-P-option-to-vacuumdb-command.patchDownload
From 021a179d7696183394db60aedbd1acb0301ad4b0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v14 2/2] Add -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 22 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 58 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..95ff132 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,28 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        more than one.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5683ef6 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\);/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\);/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..2aee18b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -426,6 +462,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->parallel_workers > 0 && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "parallel");
+		exit(1);
+	}
+
 	if (vacopts->min_xid_age != 0 && PQserverVersion(conn) < 90600)
 	{
 		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 9.6\n"),
@@ -895,6 +939,17 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers >= 0)
+			{
+				/* PARALLEL is supported since v12 */
+				Assert(serverVersion >= 120000);
+				if (vacopts->parallel_workers == 0)
+					appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				else
+					appendPQExpBuffer(sql, "%sPARALLEL %d", sep,
+									  vacopts->parallel_workers);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1282,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

v14-0001-Add-parallel-option-to-VACUUM-command.patchtext/x-patch; charset=US-ASCII; name=v14-0001-Add-parallel-option-to-VACUUM-command.patchDownload
From ae50a69d983db2c6811b08b17918033fcacff40a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v14 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuum and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum can be performed by specifying it like
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with 2
parallel worker processes. Alternatively, setting the parallel_workers
reloption to more than 0 invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  25 +-
 doc/src/sgml/ref/vacuum.sgml          |  28 ++
 src/backend/access/heap/vacuumlazy.c  | 897 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  79 +--
 src/backend/nodes/copyfuncs.c         |  16 +-
 src/backend/nodes/equalfuncs.c        |  13 +-
 src/backend/parser/gram.y             |  72 ++-
 src/backend/postmaster/autovacuum.c   |  14 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/access/heapam.h           |   5 +-
 src/include/commands/vacuum.h         |   2 +-
 src/include/nodes/nodes.h             |   1 +
 src/include/nodes/parsenodes.h        |  20 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 16 files changed, 977 insertions(+), 208 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b6f5822..1bd1edd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2185,18 +2185,19 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
-         number of workers may not actually be available at run time.
-         If this occurs, the utility operation will run with fewer
-         workers than expected.  The default value is 2.  Setting this
-         value to 0 disables the use of parallel workers by utility
-         commands.
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree
+         index and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken
+         from the pool of processes established by
+         <xref linkend="guc-max-worker-processes"/>,
+         limited by <xref linkend="guc-max-parallel-workers"/>.
+         Note that the requested number of workers may not actually be
+         available at run time.  If this occurs, the utility operation
+         will run with fewer workers than expected.  The default value
+         is 2.  Setting this value to 0 disables the use of parallel
+         workers by utility commands.
         </para>
 
         <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..3edc623 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases in parallel using
+      <replaceable class="parameter">N</replaceable> background workers (for the
+      details of each vacuum phase, please refer to <xref linkend="vacuum-phases"/>).
+      If the parallel degree <replaceable class="parameter">N</replaceable> is
+      omitted, then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Also, if this option
+      is specified multiple times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is taken into account.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +280,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overridden by specifying the
+    <replaceable class="parameter">N</replaceable> of the <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 37aa484..e534022 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Each index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (in lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains shared information as
+ * well as the memory space for dead tuples. When starting either index vacuuming
+ * or index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed, the parallel worker processes exit and the leader process
+ * re-initializes the shared memory segment. Note that the parallel workers live
+ * only during one index vacuuming or index cleanup pass, but the leader process
+ * neither exits parallel mode nor destroys the parallel context. Since no
+ * updates are allowed during parallel mode, the leader updates the index
+ * statistics after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,10 +126,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM keys for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Structs for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples controls the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a dynamic shared memory segment in parallel lazy
+ * vacuum mode, or in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel index vacuuming and index cleanup. This is allocated in
+ * a dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for index vacuuming and index cleanup, which are necessary for
+	 * IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either the
+	 * old live tuples in index vacuuming or the new live tuples in index
+	 * cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -128,16 +213,35 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy heap vacuum execution used by only leader process.
+ * This is present only in the leader process. In parallel lazy vacuum, the
+ * 'lvshared' and 'pcxt' are not NULL and they point to the dynamic shared
+ * memory segment.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOptions	*options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;	/* true if parallel vacuum is prepared */
+
+	/* Variables for parallel lazy index vacuuming */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -150,31 +254,43 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVDeadTuples	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVDeadTuples *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVDeadTuples *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVDeadTuples *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes(LVState *lvstate,
+									IndexBulkDeleteResult **stats,
+									LVDeadTuples *dead_tuples,
+									bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation rel, int nrequests, int nindexes);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -186,9 +302,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions *options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -200,6 +317,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -217,7 +335,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options->flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +363,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options->flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -258,10 +376,23 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -332,7 +463,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -464,14 +595,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers. When
+ *		allocating the space for lazy scan heap, we enter parallel mode, create
+ *		the parallel context and initialize a dynamic shared memory segment for dead
+ *		tuples. dead_tuples points either to a dynamic shared memory segment in the
+ *		parallel vacuum case or to local memory in the single-process vacuum case.
+ *		Before starting parallel index vacuuming and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after processing all indexes,
+ *		and the leader process re-initializes the parallel context and re-launches
+ *		them at the next execution. The index statistics are updated by the leader
+ *		after exiting parallel mode, since writes are not allowed during parallel
+ *		mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVDeadTuples	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -486,7 +632,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
 				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
+	IndexBulkDeleteResult **indstats = NULL;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -494,6 +640,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -504,7 +651,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -519,9 +666,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
 	vacrelstats->scanned_pages = 0;
@@ -529,13 +673,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((lvstate->options->flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(lvstate->relation,
+													lvstate->options->nworkers,
+													lvstate->nindexes);
+
+	/*
+	 * Allocate memory space for lazy vacuum. If parallel_workers > 0, we
+	 * prepare for parallel vacuum, entering the parallel mode, initializing
+	 * a dynamic shared memory segment.
+	 */
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/*
+	 * Allocate the memory for index bulk-deletion results if in single-process
+	 * vacuum mode. In parallel mode, it has already been prepared in the shared memory
+	 * segment.
+	 */
+	if (!lvstate->parallel_ready)
+		indstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +750,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -591,7 +758,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -638,7 +805,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -647,7 +814,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -676,7 +843,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -700,7 +867,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -713,8 +880,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +909,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -758,14 +922,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats);
+			lazy_vacuum_heap(lvstate, dead_tuples);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -803,7 +967,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -836,7 +1000,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -955,7 +1119,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -994,7 +1158,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1134,7 +1298,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1203,11 +1367,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1215,7 +1380,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1331,7 +1496,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1365,7 +1530,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1381,10 +1546,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1394,7 +1556,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats);
+		lazy_vacuum_heap(lvstate, dead_tuples);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1411,8 +1573,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,8 +1631,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * process index entry removal in batches as large as possible.
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
+lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1478,7 +1643,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1487,7 +1652,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1496,8 +1661,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1532,8 +1698,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVDeadTuples *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1545,16 +1712,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1575,7 +1742,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1674,6 +1841,98 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup all indexes. If parallel vacuum is ready, this is
+ * done with parallel workers, so this function must be called only by the
+ * parallel vacuum leader process.
+ *
+ * In parallel lazy vacuum, we copy the index bulk-deletion results returned
+ * from ambulkdelete and amvacuumcleanup to the shared memory because they are
+ * allocated in local memory and it's possible that an index will be
+ * vacuumed by a different vacuum process next time.
+ *
+ * Since all vacuum workers process different indexes, we can write them
+ * without locking.
+ */
+static void
+lazy_vacuum_all_indexes(LVState *lvstate, IndexBulkDeleteResult **stats,
+						LVDeadTuples *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(lvstate->parallel_ready ||
+		   (!lvstate->parallel_ready && stats != NULL));
+
+	/* no job if the table has no index */
+	if (lvstate->nindexes <= 0)
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get the next index number to vacuum */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		/* Set the index statistics to pass to the index AM */
+		if (do_parallel)
+		{
+			/*
+			 * If there is an already-updated result in the shared memory we
+			 * use it. Otherwise we pass NULL to the index AMs, as they expect
+			 * NULL for the first-time execution.
+			 */
+			if (lvshared->indstats[idx].updated)
+				result = &(lvshared->indstats[idx].stats);
+		}
+		else
+			result = stats[idx];
+
+		/*
+		 * Vacuum or cleanup one index. For index cleanup, we don't update
+		 * index statistics during parallel mode.
+		 */
+		if (for_cleanup)
+			result = lazy_cleanup_index(lvstate->indRels[idx], result,
+										vacrelstats->new_rel_tuples,
+										vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+										!do_parallel);
+		else
+			result = lazy_vacuum_index(lvstate->indRels[idx], result,
+									   vacrelstats->old_rel_pages,
+									   dead_tuples);
+
+		if (do_parallel && result)
+		{
+			/*
+			 * Save index bulk-deletion result to the shared memory space if
+			 * this is the first time.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1681,11 +1940,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1695,28 +1954,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions %s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples,
+					IsParallelWorker() ? "by parallel vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1725,27 +1985,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1766,8 +2020,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2077,15 +2330,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVDeadTuples *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVDeadTuples	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2099,34 +2353,45 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * In parallel lazy vacuum, we enter the parallel mode and prepare all
+	 * memory necessary for executing the parallel lazy vacuum including the
+	 * space to store dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+		Assert(dead_tuples != NULL);
+
+		return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2140,12 +2405,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2293,3 +2558,387 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuuming processes one index per vacuum process. The sizes of the
+ * table and indexes don't affect the parallel degree.
+ */
+static int
+compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+		parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples.
+ */
+static LVDeadTuples *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
+
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 request, true);
+	/*
+	 * nworkers should be positive, since we always request at least one
+	 * worker and pass serializable_okay = true.
+	 */
+	Assert(pcxt->nworkers > 0);
+	lvstate->pcxt = pcxt;
+
+	/* Defensive quick exit in case no workers could be prepared */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * doesn't have debug_query_string, but it never requests a parallel
+	 * vacuum here either.
+	 */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_tuples = (int) maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either index vacuuming or index cleanup.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	appendStringInfo(&buf,
+					 ngettext("launched %d parallel vacuum worker %s (planned: %d",
+							  "launched %d parallel vacuum workers %s (planned: %d",
+							  lvstate->pcxt->nworkers_launched),
+					 lvstate->pcxt->nworkers_launched,
+					 for_cleanup ? "for index cleanup" : "for index vacuuming",
+					 lvstate->pcxt->nworkers);
+	if (lvstate->options->nworkers > 0)
+		appendStringInfo(&buf, ", requested %d", lvstate->options->nworkers);
+
+	appendStringInfo(&buf, ")");
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either index vacuuming or index cleanup */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or cleanup indexes. This function must be used by the parallel vacuum
+ * worker processes. Similar to the leader process in parallel lazy vacuum, we
+ * copy the index bulk-deletion results to the shared memory segment.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done with all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If there is an already-updated result in the shared memory we
+		 * use it. Otherwise we pass NULL to the index AMs, as they expect
+		 * NULL for the first-time execution.
+		 */
+		if (lvshared->indstats[idx].updated)
+			result = &(lvshared->indstats[idx].stats);
+
+		/* Vacuum or cleanup one index */
+		if (for_cleanup)
+			result = lazy_cleanup_index(indrels[idx], result, lvshared->reltuples,
+										lvshared->estimated_count, false);
+		else
+			result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
+									   dead_tuples);
+
+		if (result)
+		{
+			/*
+			 * Save index bulk-deletion result to the shared memory space if
+			 * this is the first time.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce2b616..fb1e951 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..1b64f15 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -67,13 +67,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options);
+static List *get_all_vacuum_rels(VacuumOptions *options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options,
 		   VacuumParams *params);
 
 /*
@@ -88,15 +88,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options->flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options->flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options->flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options->flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options->flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -111,11 +111,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options->flags & VACOPT_FULL) &&
+		(vacstmt->options->flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options->flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -143,7 +149,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions, indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -162,7 +168,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -173,7 +179,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options->flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -183,7 +189,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -205,8 +211,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options->flags & VACOPT_FULL) != 0 &&
+		(options->flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -215,7 +221,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -280,11 +286,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options->flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -334,13 +340,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options->flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -353,7 +359,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options->flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -389,7 +395,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -602,7 +608,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -634,7 +640,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options->flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -646,7 +652,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -672,7 +678,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options->flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -741,7 +747,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -759,7 +765,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options->flags))
 			continue;
 
 		/*
@@ -1520,7 +1526,8 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options,
+		   VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1541,7 +1548,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1581,10 +1588,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options->flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options->flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1604,7 +1611,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options->flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1676,7 +1683,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_SKIPTOAST) && !(options->flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1695,7 +1702,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options->flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1703,7 +1710,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options->flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b44ead2..9e576e0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3842,12 +3842,23 @@ _copyDropdbStmt(const DropdbStmt *from)
 	return newnode;
 }
 
+static VacuumOptions *
+_copyVacuumOptions(const VacuumOptions *from)
+{
+	VacuumOptions *newnode = makeNode(VacuumOptions);
+
+	COPY_SCALAR_FIELD(flags);
+	COPY_SCALAR_FIELD(nworkers);
+
+	return newnode;
+}
+
 static VacuumStmt *
 _copyVacuumStmt(const VacuumStmt *from)
 {
 	VacuumStmt *newnode = makeNode(VacuumStmt);
 
-	COPY_SCALAR_FIELD(options);
+	COPY_NODE_FIELD(options);
 	COPY_NODE_FIELD(rels);
 
 	return newnode;
@@ -5320,6 +5331,9 @@ copyObjectImpl(const void *from)
 		case T_DropdbStmt:
 			retval = _copyDropdbStmt(from);
 			break;
+		case T_VacuumOptions:
+			retval = _copyVacuumOptions(from);
+			break;
 		case T_VacuumStmt:
 			retval = _copyVacuumStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 1e169e0..011a25f 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1666,9 +1666,18 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 }
 
 static bool
+_equalVacuumOptions(const VacuumOptions *a, const VacuumOptions *b)
+{
+	COMPARE_SCALAR_FIELD(flags);
+	COMPARE_SCALAR_FIELD(nworkers);
+
+	return true;
+}
+
+static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
+	COMPARE_NODE_FIELD(options);
 	COMPARE_NODE_FIELD(rels);
 
 	return true;
@@ -3385,6 +3394,8 @@ equal(const void *a, const void *b)
 		case T_DropdbStmt:
 			retval = _equalDropdbStmt(a, b);
 			break;
+		case T_VacuumOptions:
+			retval = _equalVacuumOptions(a, b);
+			break;
 		case T_VacuumStmt:
 			retval = _equalVacuumStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf41..d2cd4a2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10430,22 +10432,23 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options = makeVacOpt(VACOPT_VACUUM, 0);
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options->flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options->flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options->flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options->flags |= VACOPT_ANALYZE;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options = $3;
+					n->options->flags |= VACOPT_VACUUM;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10453,20 +10456,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOptions *vacopt1 = $1;
+					VacuumOptions *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10478,16 +10501,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options = makeVacOpt(VACOPT_ANALYZE, 0);
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options->flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options = makeVacOpt(VACOPT_ANALYZE | $3, 0);
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -15985,6 +16008,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOptions with the given options.
+ */
+static VacuumOptions *
+makeVacOpt(VacuumFlag flag, int nworkers)
+{
+	VacuumOptions *vacopt = makeNode(VacuumOptions);
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d1177b3..22ec846 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,8 +187,8 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
+	VacuumOptions	*at_vacoptions;
+	VacuumParams	at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
@@ -2481,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions->flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2882,10 +2882,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions = makeNode(VacuumOptions);
+		tab->at_vacoptions->flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions->nworkers = 0;	/* parallel lazy vacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3131,10 +3133,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions->flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions->flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f..a735ff9 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options->flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options->flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ab08791..62e75d8 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,11 +14,13 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -185,8 +187,9 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions *options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..dd71f0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index e215ad4..70b9231 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -474,6 +474,7 @@ typedef enum NodeTag
 	T_PartitionBoundSpec,
 	T_PartitionRangeDatum,
 	T_PartitionCmd,
+	T_VacuumOptions,
 	T_VacuumRelation,
 
 	/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fe14d7..526caa2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3147,7 +3147,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3156,8 +3156,16 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy VACUUM in parallel */
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	NodeTag		type;
+	VacuumFlag	flags;	/* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3176,9 +3184,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions  *options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

#23Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#19)

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

+ * Before starting parallel index vacuum and parallel cleanup index we launch
+ * parallel workers. All parallel workers will exit after processed all indexes

parallel vacuum index and parallel cleanup index?

ISTM we're using terms like "index vacuuming", "index cleanup" and "FSM
vacuuming" in vacuumlazy.c, so maybe "parallel index vacuuming" and
"parallel index cleanup" would be better?

OK.

+ /*
+ * If there is already-updated result in the shared memory we
+ * use it. Otherwise we pass NULL to index AMs and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really find a need for the flag to differentiate the stats
pointer from the first run to the second run. I don't see any problem in
passing the stats directly, and the same stats are updated on the worker
side and the leader side. Anyway, no two processes will vacuum the same
index at the same time. Am I missing something?

Even if this flag is to identify whether the stats are updated or not
before writing them, I don't see a need for it compared to normal vacuum.

Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first-time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
locally allocated memory. Therefore, in the parallel vacuum I think that
both workers and the leader need to move it to the shared memory and
mark it as updated, as a different worker could vacuum a different index
the next time.
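
To illustrate, each process would handle one index roughly like this (a
sketch only; it follows the shape of lazy_vacuum_indexes_for_worker() in
the patch upthread, not necessarily the final code):

    IndexBulkDeleteResult *result = NULL;

    /*
     * Reuse the result that a previous pass stored in shared memory;
     * otherwise pass NULL so the index AM sees a first-time call and
     * allocates its own stats in local memory.
     */
    if (lvshared->indstats[idx].updated)
        result = &(lvshared->indstats[idx].stats);

    result = index_bulk_delete(&ivinfo, result, lazy_tid_reaped, dead_tuples);

    /*
     * If the AM returned locally allocated stats, copy them into shared
     * memory and mark the slot as updated so that whichever process
     * vacuums this index next time can find them.
     */
    if (result != NULL && !lvshared->indstats[idx].updated)
    {
        memcpy(&(lvshared->indstats[idx].stats), result,
               sizeof(IndexBulkDeleteResult));
        lvshared->indstats[idx].updated = true;
    }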

OK, understood the point. But btbulkdelete allocates the memory whenever
the stats are NULL, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples
present in the table, btbulkdelete is not called and btvacuumcleanup is
called directly at the end of vacuum. In that scenario, the code flow
differs based on the stats. So why can't we use the dead tuples number
to differentiate, instead of adding another flag? Also, this scenario is
not very frequent, so avoiding the memcpy for normal operations would be
better. It may be a small gain; just thought of it.

+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers?
Aren't both the same?

The requested value is the parallel degree that is specified explicitly
by the user, whereas the planned value is the actual number we planned
based on the number of indexes the table has. For example, if we do
'VACUUM (PARALLEL 3000) tbl' where tbl has 4 indexes, the request is
3000 and the planned is 4. Also, if max_parallel_maintenance_workers is
2, the planned is 2.
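
In code terms the planning is roughly this (a sketch of
compute_parallel_workers() from the patch; plan_workers is a made-up
name, and the reloption case is omitted):

    static int
    plan_workers(int nrequested, int nindexes)
    {
        int     parallel_workers;

        if (nindexes <= 1)
            return 0;           /* no index to hand out to a worker */

        /* requested degree, capped by the number of indexes */
        if (nrequested > 0)
            parallel_workers = Min(nrequested, nindexes - 1);
        else
            parallel_workers = nindexes - 1;

        /* further capped by the GUC */
        return Min(parallel_workers, max_parallel_maintenance_workers);
    }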

OK.

Regards,
Haribabu Kommi
Fujitsu Australia

#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#23)

On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup when the first time calling. And they store the result
stats to the memory allocated int the local memory. Therefore in the
parallel vacuum I think that both worker and leader need to move it to
the shared memory and mark it as updated as different worker could
vacuum different indexes at the next time.

OK, understood the point. But btbulkdelete allocates the memory whenever
the stats are NULL, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples
present in the table, btbulkdelete is not called and btvacuumcleanup is
called directly at the end of vacuum. In that scenario, the code flow
differs based on the stats. So why can't we use the dead tuples number
to differentiate, instead of adding another flag?

I don't understand your suggestion. What do we compare the dead tuples
number to? Could you elaborate on that, please?

Also, this scenario is not very frequent, so avoiding the memcpy for
normal operations would be better. It may be a small gain; just thought
of it.

This scenario could happen periodically on an insert-only table. The
additional memcpy is executed once per index in a vacuum, but I agree
that avoiding the memcpy would be good.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#25Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#24)

On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com>

wrote:

Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first-time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
locally allocated memory. Therefore, in the parallel vacuum I think that
both workers and the leader need to move it to the shared memory and
mark it as updated, as a different worker could vacuum a different index
the next time.

OK, understood the point. But btbulkdelete allocates the memory whenever
the stats are NULL, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples
present in the table, btbulkdelete is not called and btvacuumcleanup is
called directly at the end of vacuum. In that scenario, the code flow
differs based on the stats. So why can't we use the dead tuples number
to differentiate, instead of adding another flag?

I don't understand your suggestion. What do we compare the dead tuples
number to? Could you elaborate on that, please?

The scenario where the stats should pass NULL to the btvacuumcleanup
function is when there are no dead tuples. I just think that we may use
that dead tuples structure to find out whether the stats should be NULL
or not, while avoiding the extra memcpy.
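
Something like the following is what I have in mind (purely
illustrative, using the field names from the patch):

    /*
     * Hypothetical alternative: infer "first call" from the dead tuples
     * count instead of keeping a separate 'updated' flag per index.
     */
    if (dead_tuples->num_tuples == 0)
        result = NULL;      /* bulk delete never ran; cleanup-only path */
    else
        result = &(lvshared->indstats[idx].stats);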

Also, this scenario is not very frequent, so avoiding the memcpy for
normal operations would be better. It may be a small gain; just thought
of it.

This scenario could happen periodically on an insert-only table. The
additional memcpy is executed once per index in a vacuum, but I agree
that avoiding the memcpy would be good.

Yes, understood. If possible, removing the need for the memcpy would be
good. The latest patch doesn't apply anymore; it needs a rebase.

Regards,
Haribabu Kommi
Fujitsu Australia

#26Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#25)
2 attachment(s)

On Wed, Feb 13, 2019 at 9:32 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first-time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
locally allocated memory. Therefore, in the parallel vacuum I think that
both workers and the leader need to move it to the shared memory and
mark it as updated, as a different worker could vacuum a different index
the next time.

OK, understood the point. But btbulkdelete allocates the memory whenever
the stats are NULL, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples
present in the table, btbulkdelete is not called and btvacuumcleanup is
called directly at the end of vacuum. In that scenario, the code flow
differs based on the stats. So why can't we use the dead tuples number
to differentiate, instead of adding another flag?

I don't understand your suggestion. What do we compare the dead tuples
number to? Could you elaborate on that, please?

The scenario where the stats should pass NULL to the btvacuumcleanup
function is when there are no dead tuples. I just think that we may use
that dead tuples structure to find out whether the stats should be NULL
or not, while avoiding the extra memcpy.

Thank you for your explanation. I understood. Maybe I'm worrying too
much, but I'm concerned about compatibility; currently we handle indexes
individually. So if there is an index access method whose ambulkdelete
returns NULL at the first call but returns a palloc'd struct at the
second or later calls, that wouldn't work fine.

The documentation says that the passed-in 'stats' is NULL at the first
call of ambulkdelete, but doesn't say anything about the second call or
later. Index access methods may expect that the passed-in 'stats' is the
same as what they returned last time. So I think we should add an extra
flag to keep compatibility.
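
For example, a hypothetical index AM like the one below would break if
we relied on NULL-ness alone (the AM and its deferral behavior are made
up for illustration):

    static bool bulkdelete_seen_first_call = false;

    /*
     * Hypothetical ambulkdelete: reports nothing on its first call and
     * allocates stats only on later calls. NULL-ness of the stats slot
     * in shared memory therefore can't tell "first call" apart from
     * "AM chose to return NULL" -- hence the explicit 'updated' flag.
     */
    static IndexBulkDeleteResult *
    lazybulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
                   IndexBulkDeleteCallback callback, void *callback_state)
    {
        if (!bulkdelete_seen_first_call)
        {
            bulkdelete_seen_first_call = true;
            return NULL;    /* first call: defer allocation */
        }

        if (stats == NULL)
            stats = palloc0(sizeof(IndexBulkDeleteResult));
        /* ... do the actual deletion work and update stats here ... */
        return stats;
    }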

Also, this scenario is not very frequent, so avoiding the memcpy for
normal operations would be better. It may be a small gain; just thought
of it.

This scenario could happen periodically on an insert-only table. The
additional memcpy is executed once per index in a vacuum, but I agree
that avoiding the memcpy would be good.

Yes, understood. If possible, removing the need for the memcpy would be
good. The latest patch doesn't apply anymore; it needs a rebase.

Thank you. Attached the rebased patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v15-0001-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From eef17263197bd060abeea43c5d414a617ed347b7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Dec 2018 14:48:34 +0900
Subject: [PATCH v15 1/2] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuuming and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum can be requested by specifying, for example,
VACUUM (PARALLEL 2) tbl, which performs vacuum with 2 parallel
worker processes. Setting the parallel_workers reloption to a
value greater than 0 also invokes parallel vacuum.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  24 +-
 doc/src/sgml/ref/vacuum.sgml          |  28 ++
 src/backend/access/heap/vacuumlazy.c  | 893 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  78 +--
 src/backend/nodes/equalfuncs.c        |   6 +-
 src/backend/parser/gram.y             |  73 ++-
 src/backend/postmaster/autovacuum.c   |  13 +-
 src/backend/tcop/utility.c            |   4 +-
 src/include/access/heapam.h           |   5 +-
 src/include/commands/vacuum.h         |   2 +-
 src/include/nodes/parsenodes.h        |  19 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 14 files changed, 946 insertions(+), 208 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8bd57f3..9736249 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2209,18 +2209,18 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
-         number of workers may not actually be available at run time.
-         If this occurs, the utility operation will run with fewer
-         workers than expected.  The default value is 2.  Setting this
-         value to 0 disables the use of parallel workers by utility
-         commands.
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from the pool of
+         processes established by <xref linkend="guc-max-worker-processes"/>,
+         limited by <xref linkend="guc-max-parallel-workers"/>.
+         Note that the requested number of workers may not actually be
+         available at run time.  If this occurs, the utility operation
+         will run with fewer workers than expected.  The default value
+         is 2.  Setting this value to 0 disables the use of parallel
+         workers by utility commands.
         </para>
 
         <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..add3060 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Perform index vacuuming and index cleanup in parallel using
+      <replaceable class="parameter">N</replaceable> background workers (for details
+      of each vacuum phase, please refer to <xref linkend="vacuum-phases"/>). If the
+      parallel degree <replaceable class="parameter">N</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Also, if this option
+      is specified multiple times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is taken into account.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -261,6 +280,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overridden by the
+    <replaceable class="parameter">N</replaceable> of the <literal>PARALLEL</literal>
+    option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..7ae45fb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Each index is processed by one vacuum process. At the beginning of lazy
+ * vacuum (at lazy_scan_heap) we prepare the parallel context and initialize
+ * the shared memory segment that contains shared information as well as the
+ * memory space for dead tuples. When starting either index vacuum or index
+ * cleanup, we launch parallel worker processes. Once all indexes are processed
+ * the parallel worker processes exit and the leader process re-initializes the
+ * shared memory segment. Note that the parallel workers live only during a
+ * single index vacuum or index cleanup pass, but the leader process neither
+ * exits parallel mode nor destroys the parallel context in between. Since no
+ * updates are allowed during parallel mode, the leader updates the index
+ * statistics after exiting parallel mode.
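+ *
+ * A rough sketch of the leader's control flow in parallel lazy vacuum
+ * (function names as in this file):
+ *
+ *   lazy_scan_heap()
+ *     lazy_space_alloc()         -- enter parallel mode, create DSM
+ *     ... scan heap, collecting dead tuples ...
+ *     lazy_vacuum_all_indexes()  -- launch workers, vacuum indexes
+ *     lazy_vacuum_heap()         -- reclaim dead line pointers
+ *     lazy_vacuum_all_indexes()  -- relaunch workers, cleanup indexes
+ *     lazy_end_parallel()        -- exit parallel mode, update index stats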
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,10 +126,79 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/* DSM keys for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED			UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		UINT64CONST(0xFFFFFFFFFFF00003)
+
+/*
+ * Struct for an index bulk-deletion statistic used by parallel lazy vacuum.
+ * This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* have the stats been updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a dynamic shared memory segment in parallel lazy
+ * vacuum mode, or in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Status for parallel index vacuum and index cleanup. This is allocated in
+ * a dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to do index vacuum or index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for index vacuum or index cleanup, needed to fill in
+	 * IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuum, or the new live tuples for index
+	 * cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuum. The variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
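+
+/*
+ * Note: the DSM chunk for LVShared is sized to hold one LVIndStats slot per
+ * index of the target table (see lazy_prepare_parallel()), and each slot is
+ * written by only one vacuum process at a time.
+ */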
+
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -128,16 +213,35 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
+/*
+ * Working state for lazy heap vacuum execution, present only in the leader
+ * process. In parallel lazy vacuum, 'lvshared' and 'pcxt' are non-NULL and
+ * point into the dynamic shared memory segment.
+ */
+typedef struct LVState
+{
+	Relation	relation;
+	LVRelStats	*vacrelstats;
+	Relation	*indRels;
+	/* nindexes > 0 means two-pass strategy; 0 means one-pass */
+	int			nindexes;
+
+	/* Lazy vacuum options and scan status */
+	VacuumOptions	options;
+	bool			is_wraparound;
+	bool			aggressive;
+	bool			parallel_ready;	/* true if parallel vacuum is prepared */
+
+	/* Variables for parallel lazy index vacuum */
+	LVShared		*lvshared;
+	ParallelContext	*pcxt;
+} LVState;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -150,31 +254,44 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
-			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
-static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
+static void lazy_scan_heap(LVState *lvstate);
+static void lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples,
+							 BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
-static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVDeadTuples	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
+static int lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+					Buffer buffer, int tupindex, Buffer *vmbuffer,
+					TransactionId latestRemovedXid, LVDeadTuples *dead_tuples);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static LVDeadTuples *lazy_space_alloc(LVState *lvstate, BlockNumber relblocks,
+								  int parallel_workers);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVDeadTuples *lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request);
+static void lazy_end_parallel(LVState *lvstate, bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVState *lvstate);
+static void lazy_vacuum_all_indexes(LVState *lvstate,
+									IndexBulkDeleteResult **stats,
+									LVDeadTuples *dead_tuples,
+									bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation rel, int nrequests, int nindexes);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -186,9 +303,10 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
+	LVState	   *lvstate;
 	LVRelStats *vacrelstats;
 	Relation   *Irel;
 	int			nindexes;
@@ -200,6 +318,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 				write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
+	bool		hasindex;
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
 	BlockNumber new_rel_pages;
@@ -217,7 +336,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options.flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +364,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options.flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -258,10 +377,23 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	hasindex = (nindexes > 0);
+
+	/* Create a lazy vacuum working state */
+	lvstate = (LVState *) palloc0(sizeof(LVState));
+	lvstate->relation = onerel;
+	lvstate->vacrelstats = vacrelstats;
+	lvstate->indRels = Irel;
+	lvstate->nindexes = nindexes;
+	lvstate->options = options;
+	lvstate->is_wraparound = params->is_wraparound;
+	lvstate->aggressive = aggressive;
+	lvstate->parallel_ready = false;
+	lvstate->lvshared = NULL;
+	lvstate->pcxt = NULL;
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(lvstate);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -332,7 +464,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						hasindex,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -464,14 +596,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuum and index cleanup with parallel workers. When
+ *		allocating the space for the heap scan, we enter parallel mode, create
+ *		the parallel context and initialize a dynamic shared memory segment for
+ *		dead tuples. dead_tuples points either to a dynamic shared memory segment
+ *		in the parallel vacuum case or to local memory in the single-process case.
+ *		Before starting parallel index vacuum and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after processing all indexes,
+ *		and the leader process re-initializes the parallel context and re-launches
+ *		them at the next execution. The index statistics are updated by the leader
+ *		after exiting parallel mode, since no writes are allowed while parallel
+ *		mode is active.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+lazy_scan_heap(LVState *lvstate)
 {
+	Relation	onerel = lvstate->relation;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	LVDeadTuples	*dead_tuples = NULL;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -486,7 +633,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
 				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
+	IndexBulkDeleteResult **indstats = NULL;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -494,6 +641,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -504,7 +652,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pg_rusage_init(&ru0);
 
 	relname = RelationGetRelationName(onerel);
-	if (aggressive)
+	if (lvstate->aggressive)
 		ereport(elevel,
 				(errmsg("aggressively vacuuming \"%s.%s\"",
 						get_namespace_name(RelationGetNamespace(onerel)),
@@ -519,9 +667,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
 	vacrelstats->scanned_pages = 0;
@@ -529,13 +674,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request; a non-zero
+	 * result enables parallel lazy vacuum.
+	 */
+	if ((lvstate->options.flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(lvstate->relation,
+													lvstate->options.nworkers,
+													lvstate->nindexes);
+
+	/*
+	 * Allocate memory space for lazy vacuum. If parallel_workers > 0, we
+	 * prepare for parallel vacuum by entering parallel mode and initializing
+	 * a dynamic shared memory segment.
+	 */
+	dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/*
+	 * Allocate the memory for index bulk-deletion results if in single
+	 * vacuum mode. In parallel mode, it has already been prepared in the
+	 * shared memory segment.
+	 */
+	if (!lvstate->parallel_ready)
+		indstats = (IndexBulkDeleteResult **)
+			palloc0(lvstate->nindexes * sizeof(IndexBulkDeleteResult *));
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +751,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -591,7 +759,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
 												&vmbuffer);
-			if (aggressive)
+			if (lvstate->aggressive)
 			{
 				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 					break;
@@ -638,7 +806,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((lvstate->options.flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -647,7 +815,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vmskipflags = visibilitymap_get_status(onerel,
 														   next_unskippable_block,
 														   &vmbuffer);
-					if (aggressive)
+					if (lvstate->aggressive)
 					{
 						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 							break;
@@ -676,7 +844,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's not all-visible.  But in an aggressive vacuum we know only
 			 * that it's not all-frozen, so it might still be all-visible.
 			 */
-			if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
+			if (lvstate->aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
 				all_visible_according_to_vm = true;
 		}
 		else
@@ -700,7 +868,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				 * know whether it was all-frozen, so we have to recheck; but
 				 * in this case an approximate answer is OK.
 				 */
-				if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
+				if (lvstate->aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
 					vacrelstats->frozenskipped_pages++;
 				continue;
 			}
@@ -713,8 +881,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +910,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -758,14 +923,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
-			lazy_vacuum_heap(onerel, vacrelstats, nblocks);
+			lazy_vacuum_heap(lvstate, dead_tuples, nblocks);
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -803,7 +968,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * it's OK to skip vacuuming pages we get a lock conflict on. They
 			 * will be dealt with in some future vacuum.
 			 */
-			if (!aggressive && !FORCE_CHECK_PAGE())
+			if (!lvstate->aggressive && !FORCE_CHECK_PAGE())
 			{
 				ReleaseBuffer(buf);
 				vacrelstats->pinskipped_pages++;
@@ -836,7 +1001,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					vacrelstats->nonempty_pages = blkno + 1;
 				continue;
 			}
-			if (!aggressive)
+			if (!lvstate->aggressive)
 			{
 				/*
 				 * Here, we must not advance scanned_pages; that would amount
@@ -961,7 +1126,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1165,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1305,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,11 +1374,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (lvstate->nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			lazy_vacuum_page(lvstate, onerel, blkno, buf, 0, &vmbuffer,
+							 lvstate->vacrelstats->latestRemovedXid,
+							 dead_tuples);
 			has_dead_tuples = false;
 
 			/*
@@ -1221,7 +1387,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1503,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1537,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1553,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1400,7 +1563,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		/* Remove tuples from heap */
 		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
-		lazy_vacuum_heap(onerel, vacrelstats, nblocks);
+		lazy_vacuum_heap(lvstate, dead_tuples, nblocks);
 		vacrelstats->num_index_scans++;
 	}
 
@@ -1417,8 +1580,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(lvstate, indstats, dead_tuples, true);
+
+	if (lvstate->parallel_ready)
+		lazy_end_parallel(lvstate, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1474,8 +1639,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
  * Note: nblocks is passed as an optimization for RecordPageWithFreeSpace().
  */
 static void
-lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
+lazy_vacuum_heap(LVState *lvstate, LVDeadTuples *dead_tuples, BlockNumber nblocks)
 {
+	Relation	onerel = lvstate->relation;
 	int			tupindex;
 	int			npages;
 	PGRUsage	ru0;
@@ -1485,7 +1651,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1660,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1503,8 +1669,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(lvstate, onerel, tblk, buf, tupindex,
+									&vmbuffer, lvstate->vacrelstats->latestRemovedXid,
+									dead_tuples);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1539,8 +1706,9 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
  * The return value is the first tupindex after the tuples of this page.
  */
 static int
-lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+lazy_vacuum_page(LVState *lvstate, Relation onerel, BlockNumber blkno,
+				 Buffer buffer, int tupindex, Buffer *vmbuffer,
+				 TransactionId latestRemovedXid, LVDeadTuples *dead_tuples)
 {
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
@@ -1552,16 +1720,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1582,7 +1750,7 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 		recptr = log_heap_clean(onerel, buffer,
 								NULL, 0, NULL, 0,
 								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+								latestRemovedXid);
 		PageSetLSN(page, recptr);
 	}
 
@@ -1682,6 +1850,98 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If parallel vacuum is prepared, this is
+ * done with parallel workers, so this function must only be used by the
+ * parallel vacuum leader process.
+ *
+ * In parallel lazy vacuum, we copy the index bulk-deletion results returned
+ * from ambulkdelete and amvacuumcleanup to the shared memory because they are
+ * allocated in local memory and the same index may be vacuumed by a different
+ * vacuum process next time.
+ *
+ * Since all vacuum workers process different indexes, we can write the
+ * results without locking.
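+ *
+ * For example, with three indexes and two launched workers, the leader and
+ * both workers repeatedly fetch-and-increment the shared 'nprocessed'
+ * counter; whichever process draws 0, 1 or 2 processes that index, and any
+ * process drawing 3 or more stops.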
+ */
+static void
+lazy_vacuum_all_indexes(LVState *lvstate, IndexBulkDeleteResult **stats,
+						LVDeadTuples *dead_tuples, bool for_cleanup)
+{
+	LVShared	*lvshared = lvstate->lvshared;
+	LVRelStats	*vacrelstats = lvstate->vacrelstats;
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(lvstate->parallel_ready || stats != NULL);
+
+	/* no job if the table has no index */
+	if (lvstate->nindexes <= 0)
+		return;
+
+	if (lvstate->parallel_ready)
+		do_parallel = lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get the next index number to process */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? Check before touching any stats array. */
+		if (idx >= lvstate->nindexes)
+			break;
+
+		/*
+		 * Set the index statistics. In parallel mode, if there is an
+		 * already-updated result in the shared memory we use it; otherwise
+		 * we pass NULL to the index AM, meaning it's the first call, and
+		 * copy the result into the shared memory segment afterwards.
+		 */
+		if (do_parallel)
+		{
+			if (lvshared->indstats[idx].updated)
+				result = &(lvshared->indstats[idx].stats);
+		}
+		else
+			result = stats[idx];
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * the index statistics while in parallel mode.
+		 */
+		if (for_cleanup)
+			result = lazy_cleanup_index(lvstate->indRels[idx], result,
+										vacrelstats->new_rel_tuples,
+										vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+										!do_parallel);
+		else
+			result = lazy_vacuum_index(lvstate->indRels[idx], result,
+									   vacrelstats->old_live_tuples,
+									   dead_tuples);
+
+		if (do_parallel && result)
+		{
+			/*
+			 * Save the index bulk-deletion result to the shared memory space
+			 * if it is not already stored there.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lvstate);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1689,11 +1949,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
 
@@ -1703,28 +1963,29 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg("scanned index \"%s\" to remove %d row versions%s",
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples,
+					IsParallelWorker() ? " by parallel vacuum worker" : ""),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
 	PGRUsage	ru0;
@@ -1733,27 +1994,21 @@ lazy_cleanup_index(Relation indrel,
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1774,8 +2029,7 @@ lazy_cleanup_index(Relation indrel,
 					   stats->tuples_removed,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
+	return stats;
 }
 
 /*
@@ -2084,15 +2338,16 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  *
  * See the comments at the head of this file for rationale.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static LVDeadTuples *
+lazy_space_alloc(LVState *lvstate, BlockNumber relblocks, int parallel_workers)
 {
+	LVDeadTuples	*dead_tuples = NULL;
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (lvstate->nindexes > 0)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2361,46 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
+
+	/*
+	 * In parallel lazy vacuum, we enter parallel mode and prepare all the
+	 * memory necessary for parallel lazy vacuum, including the space to
+	 * store dead tuples.
+	 */
+	if (parallel_workers > 0)
+	{
+		dead_tuples = lazy_prepare_parallel(lvstate, maxtuples, parallel_workers);
+
+		/* Preparation succeeded; return the shared dead tuple space */
+		if (dead_tuples)
+			return dead_tuples;
 	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	return dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2414,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2567,381 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuum processes each index with one vacuum process. The sizes of
+ * the table and indexes do not affect the parallel degree.
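+ *
+ * For example, with four indexes and neither an explicit request nor a
+ * reloption, we request three workers (nindexes - 1), further capped by
+ * max_parallel_maintenance_workers.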
+ */
+static int
+compute_parallel_workers(Relation rel, int nrequests, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequests)
+		parallel_workers = Min(nrequests, nindexes - 1);
+	else if (rel->rd_options)
+	{
+		StdRdOptions *relopts = (StdRdOptions *) rel->rd_options;
+		parallel_workers = Min(relopts->parallel_workers, nindexes - 1);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */
+static LVDeadTuples *
+lazy_prepare_parallel(LVState *lvstate, long maxtuples, int request)
+{
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(request > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 request, true);
+	lvstate->pcxt = pcxt;
+
+	/* quick exit if no workers are prepared, e.g. under serializable isolation */
+	if (pcxt->nworkers == 0)
+	{
+		lazy_end_parallel(lvstate, false);
+		return NULL;
+	}
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), lvstate->nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/*
+	 * Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space. Autovacuum
+	 * workers don't have debug_query_string, so use an empty string instead.
+	 */
+	querylen = debug_query_string ? strlen(debug_query_string) : 0;
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = RelationGetRelid(lvstate->relation);
+	shared->is_wraparound = lvstate->is_wraparound;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < lvstate->nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lvstate->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	if (debug_query_string)
+		memcpy(sharedquery, debug_query_string, querylen);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	/* All setup is done, now we're ready for parallel vacuum execution */
+	lvstate->parallel_ready = true;
+
+	return tidmap;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVState *lvstate, bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats && lvstate->nindexes > 0)
+	{
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * lvstate->nindexes);
+		memcpy(copied_indstats, lvstate->lvshared->indstats,
+			   sizeof(LVIndStats) * lvstate->nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+	DestroyParallelContext(lvstate->pcxt);
+	ExitParallelMode();
+
+	if (copied_indstats)
+	{
+		int i;
+
+		for (i = 0; i < lvstate->nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(lvstate->indRels[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+
+	lvstate->parallel_ready = false;
+}
+
+/*
+ * Begin parallel index vacuum or parallel index cleanup. Set shared
+ * information and launch parallel worker processes. Return true if at
+ * least one worker has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
+{
+	LVRelStats *vacrelstats = lvstate->vacrelstats;
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either vacuuming indexes or cleaning indexes.
+	 */
+	lvstate->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lvstate->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lvstate->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lvstate->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lvstate->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	appendStringInfo(&buf,
+					 ngettext("launched %d parallel vacuum worker %s (planned: %d",
+							  "launched %d parallel vacuum workers %s (planned: %d",
+							  lvstate->pcxt->nworkers_launched),
+					 lvstate->pcxt->nworkers_launched,
+					 for_cleanup ? "for index cleanup" : "for index vacuum",
+					 lvstate->pcxt->nworkers);
+	if (lvstate->options.nworkers > 0)
+		appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
+
+	appendStringInfo(&buf, ")");
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't end parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lvstate);
+		return false;
+	}
+
+	WaitForParallelWorkersToAttach(lvstate->pcxt);
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVState *lvstate)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lvstate->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lvstate->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the DSM space so that we can relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lvstate->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
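+	/* (each parallel worker accumulates its own cost balance independently) */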
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function must only be used by parallel
+ * vacuum worker processes. As in the leader process, we copy the index
+ * bulk-deletion results to the shared memory segment.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If there is an already-updated result in the shared memory we use
+		 * it. Otherwise we pass NULL to the index AM, meaning it's the first
+		 * call, and copy the result to the shared memory segment afterwards.
+		 */
+		if (lvshared->indstats[idx].updated)
+			result = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuuming or cleanup one index */
+		if (for_cleanup)
+			result = lazy_cleanup_index(indrels[idx], result, lvshared->reltuples,
+									   lvshared->estimated_count, false);
+		else
+			result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
+									  dead_tuples);
+
+		if (result)
+		{
+			/*
+			 * Save the index bulk-deletion result to the shared memory space
+			 * if it is not already stored there.
+			 */
+			if (!lvshared->indstats[idx].updated)
+				memcpy(&(lvshared->indstats[idx].stats), result,
+					   sizeof(IndexBulkDeleteResult));
+
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce2b616..fb1e951 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..2f8446a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -67,13 +67,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options);
+static List *get_all_vacuum_rels(VacuumOptions options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options,
 		   VacuumParams *params);
 
 /*
@@ -88,15 +88,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options.flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options.flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options.flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options.flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options.flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -111,11 +111,17 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
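+	/*
+	 * PARALLEL is meaningless with FULL, which rewrites the table and
+	 * rebuilds its indexes via CLUSTER instead of performing index vacuum.
+	 */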
+	if ((vacstmt->options.flags & VACOPT_FULL) &&
+		(vacstmt->options.flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options.flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -143,7 +149,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions struct indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -162,7 +168,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -173,7 +179,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options.flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -183,7 +189,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -205,8 +211,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options.flags & VACOPT_FULL) != 0 &&
+		(options.flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -215,7 +221,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -280,11 +286,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options.flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options.flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -334,13 +340,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options.flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -353,7 +359,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options.flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -389,7 +395,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options.flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -602,7 +608,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -634,7 +640,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options.flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -646,7 +652,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options.flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -672,7 +678,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options.flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -741,7 +747,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -759,7 +765,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options.flags))
 			continue;
 
 		/*
@@ -1520,7 +1526,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1541,7 +1547,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1581,10 +1587,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options.flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options.flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1604,7 +1610,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options.flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1676,7 +1682,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options.flags & VACOPT_SKIPTOAST) && !(options.flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1695,7 +1701,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options.flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1703,7 +1709,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options.flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 1e169e0..03dba93 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1668,8 +1668,11 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
-	COMPARE_NODE_FIELD(rels);
+	if (a->options.flags != b->options.flags)
+		return false;
+	if (a->options.nworkers != b->options.nworkers)
+		return false;
+	COMPARE_NODE_FIELD(rels);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ef6bbe3..9829ca3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10434,22 +10436,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options.flags = VACOPT_VACUUM;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options.flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options.flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options.flags |= VACOPT_ANALYZE;
+					n->options.nworkers = 0;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options.flags = VACOPT_VACUUM | $3->flags;
+					n->options.nworkers = $3->nworkers;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10457,20 +10461,40 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					VacuumOptions *vacopt1 = $1;
+					VacuumOptions *vacopt2 = $3;
+
+					vacopt1->flags |= vacopt2->flags;
+					if (vacopt2->flags == VACOPT_PARALLEL)
+						vacopt1->nworkers = vacopt2->nworkers;
+					pfree(vacopt2);
+					$$ = vacopt1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10482,16 +10506,16 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options.flags = VACOPT_ANALYZE;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options.flags |= VACOPT_VERBOSE;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options.flags = VACOPT_ANALYZE | $3;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -15990,6 +16014,19 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
+/*
+ * Create a VacuumOptions with the given options.
+ */
+static VacuumOptions *
+makeVacOpt(VacuumFlag flag, int nworkers)
+{
+	VacuumOptions *vacopt = palloc(sizeof(VacuumOptions));
+
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
+}
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d1177b3..8555a23 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,8 +187,8 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
+	VacuumOptions	at_vacoptions;
+	VacuumParams	at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
@@ -2481,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2882,10 +2882,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;	/* parallel lazy vacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3131,10 +3132,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f..f74d17a 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options.flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options.flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ab08791..d862cf7 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,11 +14,13 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -185,8 +187,9 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..d3503e2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fe14d7..984ceb1 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3147,7 +3147,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3156,8 +3156,15 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy VACUUM in parallel */
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	VacuumFlag	flags;	/* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3176,9 +3183,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions	options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

Attachment: v15-0002-Add-P-option-to-vacuumdb-command.patch (application/octet-stream)
From 0691a4664110c1b65a544c82048212fee259f9e8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v15 2/2] Add -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 12 ++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..79cad7e 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -223,6 +223,22 @@ PostgreSQL documentation
          <productname>PostgreSQL</productname> 9.6 and later.
         </para>
        </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute the vacuum in parallel with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-maintenance-workers"/>
+        setting is at least one.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..b9799db 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

#27Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#26)

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Wed, Feb 13, 2019 at 9:32 PM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first-time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
memory allocated in local memory. Therefore in the parallel vacuum I
think that both worker and leader need to move it to the shared
memory and mark it as updated, as a different worker could vacuum
different indexes the next time.

OK, understood the point. But for btbulkdelete, whenever the stats
are NULL it allocates the memory, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead
tuples present in the table, btbulkdelete is not called and
btvacuumcleanup is called directly at the end of vacuum. In that
scenario there is a code-flow difference based on the stats. So why
can't we use the deadtuples number to differentiate instead of adding
another flag?

I don't understand your suggestion. What do we compare the deadtuples
number to? Could you elaborate on that please?

The scenario where the stats should pass NULL to the btvacuumcleanup
function is when there are no dead tuples. I just think that we may
use that deadtuples structure to find out whether stats should pass
NULL or not while avoiding the extra memcpy.

Thank you for your explanation. I understood. Maybe I'm worrying too
much, but I'm concerned about compatibility; currently we handle
indexes individually. So if there is an index access method whose
ambulkdelete returns NULL at the first call but returns a palloc'd
struct at the second or later calls, that wouldn't work fine.

The documentation says that the passed-in 'stats' is NULL at the
first call of ambulkdelete but doesn't say anything about the second
or later calls. Index access methods may expect that the passed-in
'stats' is the same as what they returned last time. So I think we
should add an extra flag to keep compatibility.

I checked some of the ambulkdelete functions, and they do not return
NULL whenever those functions are called; but the palloc'd structure
doesn't get filled with the details.

IMO, parallel vacuum doesn't need any extra code here compared to
normal vacuum.

Regards,
Haribabu Kommi
Fujitsu Australia

#28Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#26)

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

Thank you. Attached the rebased patch.

I ran some performance tests to compare the parallelism benefits,
but I got some strange results showing performance overhead; maybe it
is because I tested it on my laptop.

FYI,

Table schema:

create table tbl(f1 int, f2 char(100), f3 float4, f4 char(100), f5 float8,
f6 char(100), f7 bigint);

Tbl with 3 indexes

1000 record deletion
master - 22ms
patch - 25ms with 0 parallel workers
patch - 43ms with 1 parallel worker
patch - 72ms with 2 parallel workers

10000 record deletion
master - 52ms
patch - 56ms with 0 parallel workers
patch - 79ms with 1 parallel worker
patch - 86ms with 2 parallel workers

100000 record deletion
master - 410ms
patch - 379ms with 0 parallel workers
patch - 330ms with 1 parallel worker
patch - 289ms with 2 parallel workers

Tbl with 5 indexes

1000 record deletion
master - 28ms
patch - 34ms with 0 parallel workers
patch - 86ms with 2 parallel workers
patch - 106ms with 4 parallel workers

10000 record deletion
master - 58ms
patch - 63ms with 0 parallel workers
patch - 101ms with 2 parallel workers
patch - 118ms with 4 parallel workers

100000 record deletion
master - 632ms
patch - 490ms with 0 parallel workers
patch - 455ms with 2 parallel workers
patch - 403ms with 4 parallel workers

Tbl with 7 indexes

1000 record deletion
master - 35ms
patch - 44ms with 0 parallel workers
patch - 93ms with 2 parallel workers
patch - 110ms with 4 parallel workers
patch - 123ms with 6 parallel workers

10000 record deletion
master - 76ms
patch - 78ms with 0 parallel workers
patch - 135ms with 2 parallel workers
patch - 143ms with 4 parallel workers
patch - 139ms with 6 parallel workers

100000 record deletion
master - 641ms
patch - 656ms with 0 parallel workers
patch - 613ms with 2 parallel workers
patch - 735ms with 4 parallel workers
patch - 679ms with 6 parallel workers

Regards,
Haribabu Kommi
Fujitsu Australia

#29Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#28)

On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. Attached the rebased patch.

I ran some performance tests to compare the parallelism benefits,

Thank you for testing!

but I got some strange results showing performance overhead; maybe it
is because I tested it on my laptop.

Hmm, I think parallel vacuum would help with heavy workloads like a
big table with multiple indexes. In your test results, all executions
completed within one second, which seems to be a use case that
parallel vacuum wouldn't help. I suspect that the table is small,
right? Anyway, I'll also do performance tests.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#27)

On Sat, Feb 23, 2019 at 10:28 PM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Feb 13, 2019 at 9:32 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first-time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
memory allocated in local memory. Therefore in the parallel vacuum I
think that both worker and leader need to move it to the shared
memory and mark it as updated, as a different worker could vacuum
different indexes the next time.

OK, understood the point. But for btbulkdelete, whenever the stats
are NULL it allocates the memory, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead
tuples present in the table, btbulkdelete is not called and
btvacuumcleanup is called directly at the end of vacuum. In that
scenario there is a code-flow difference based on the stats. So why
can't we use the deadtuples number to differentiate instead of adding
another flag?

I don't understand your suggestion. What do we compare the deadtuples
number to? Could you elaborate on that please?

The scenario where the stats should pass NULL to the btvacuumcleanup
function is when there are no dead tuples. I just think that we may
use that deadtuples structure to find out whether stats should pass
NULL or not while avoiding the extra memcpy.

Thank you for your explanation. I understood. Maybe I'm worrying too
much, but I'm concerned about compatibility; currently we handle
indexes individually. So if there is an index access method whose
ambulkdelete returns NULL at the first call but returns a palloc'd
struct at the second or later calls, that wouldn't work fine.

The documentation says that the passed-in 'stats' is NULL at the
first call of ambulkdelete but doesn't say anything about the second
or later calls. Index access methods may expect that the passed-in
'stats' is the same as what they returned last time. So I think we
should add an extra flag to keep compatibility.

I checked some of the ambulkdelete functions, and they do not return
NULL whenever those functions are called; but the palloc'd structure
doesn't get filled with the details.

IMO, parallel vacuum doesn't need any extra code here compared to
normal vacuum.

Hmm, I think that this code is necessary to faithfully keep the same
index vacuum behavior, especially the communication between lazy
vacuum and IAMs, as it is. The IAMs in postgres don't worry about
that, but third-party AMs might, and such an AM might be developed in
the future. On the other hand, I can understand your concern; if such
IAMs are quite rare we might not need to complicate the code
needlessly. I'd like to hear more opinions from other hackers as well.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#31Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#26)

On Thu, Feb 14, 2019 at 5:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. Attached the rebased patch.

Here are some review comments.

+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> and <command>VACUUM</command>
+         without <literal>FULL</literal> option, and only when building
+         a B-tree index.  Parallel workers are taken from the pool of

That sentence is garbled. The end part about b-tree indexes applies
only to CREATE INDEX, not to VACUUM, since VACUUM does not build indexes.

+      Vacuum index and cleanup index in parallel
+      <replaceable class="parameter">N</replaceable> background
workers (for the detail
+      of each vacuum phases, please refer to <xref
linkend="vacuum-phases"/>. If the

I have two problems with this. One is that I can't understand the
English very well. I think you mean something like: "Perform the
'vacuum index' and 'cleanup index' phases of VACUUM in parallel using
N background workers," but I'm not entirely sure. The other is that
if that is what you mean, I don't think it's a sufficient description.
Users need to understand whether, for example, only one worker can be
used per index, or whether the work for a single index can be split
across workers.

+      parallel degree <replaceable class="parameter">N</replaceable>
is omitted,
+      then <command>VACUUM</command> decides the number of workers based on
+      number of indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Also if
this option

Now this makes it sound like it's one worker per index, but you could
be more explicit about it.
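
For reference, the behavior being documented appears to be roughly
the following sketch, modeled on the patch's compute_parallel_workers
(the exact logic in the patch may differ):

static int
compute_parallel_workers(Relation rel, int nrequested, int nindexes)
{
	int			parallel_workers;

	/* at most one worker per index; fewer if the user requested fewer */
	if (nrequested > 0)
		parallel_workers = Min(nrequested, nindexes);
	else
		parallel_workers = nindexes;

	/* and never more than the GUC allows */
	return Min(parallel_workers, max_parallel_maintenance_workers);
}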

+      is specified multile times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is considered
into the account.

Typo, but I'd just delete this sentence altogether; the behavior if
the option is multiply specified seems like a triviality that need not
be documented.

+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overwritten by setting
+    <replaceable class="parameter">N</replaceable> of
<literal>PARALLEL</literal>
+    option.

I wonder if we really want this behavior. Should a setting that
controls the degree of parallelism when scanning the table also affect
VACUUM? I tend to think that we probably don't ever want VACUUM of a
table to be parallel by default, but rather something that the user
must explicitly request. Happy to hear other opinions. If we do want
this behavior, I think this should be written differently, something
like this: The PARALLEL N option to VACUUM takes precedence over this
option.

+ * parallel mode nor destories the parallel context. For updating the index

Spelling.

+/* DSM keys for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT UINT64CONST(0xFFFFFFFFFFF00003)

Any special reason not to use just 1, 2, 3 here? The general
infrastructure stuff uses high numbers to avoid conflicting with
plan_node_id values, but end clients of the parallel infrastructure
can generally just use small integers.

+ bool updated; /* is the stats updated? */

is -> are

+ * LVDeadTuples controls the dead tuple TIDs collected during heap scan.

what do you mean by "controls", exactly? stores?

+ * This is allocated in a dynamic shared memory segment when parallel
+ * lazy vacuum mode, or allocated in a local memory.

If this is in DSM, then max_tuples is a wart, I think. We can't grow
the segment at that point. I'm suspicious that we need a better
design here. It looks like you gather all of the dead tuples in
backend-local memory and then allocate an equal amount of DSM to copy
them. But that means that we are using twice as much memory, which
seems pretty bad. You'd have to do that at least momentarily no
matter what, but it's not obvious that the backend-local copy is ever
freed. There's another patch kicking around to allocate memory for
vacuum in chunks rather than preallocating the whole slab of memory at
once; we might want to think about getting that committed first and
then having this build on top of it. At least we need something
smarter than this.

-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions options, VacuumParams *params,

We generally avoid passing a struct by value; copying the struct can
be expensive and having multiple shallow copies of the same data
sometimes leads to surprising results. I think it might be a good
idea to propose a preliminary refactoring patch that invents
VacuumOptions and gives it just a single 'int' member and refactors
everything to use it, and then that can be committed first. It should
pass a pointer, though, not the actual struct.
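
For illustration, that preliminary refactoring could look something
like this sketch (the struct would later grow the nworkers member):

typedef struct VacuumOptions
{
	int			flags;			/* OR of VacuumFlag values */
} VacuumOptions;

extern void vacuum(VacuumOptions *options, List *relations,
				   VacuumParams *params, BufferAccessStrategy bstrategy,
				   bool isTopLevel);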

+ LVState *lvstate;

It's not clear to me why we need this new LVState thing. What's the
motivation for that? If it's a good idea, could it be done as a
separate, preparatory patch? It seems to be responsible for a lot of
code churn in this patch. It also leads to strange stuff like this:

  ereport(elevel,
- (errmsg("scanned index \"%s\" to remove %d row versions",
+ (errmsg("scanned index \"%s\" to remove %d row versions %s",
  RelationGetRelationName(indrel),
- vacrelstats->num_dead_tuples),
+ dead_tuples->num_tuples,
+ IsParallelWorker() ? "by parallel vacuum worker" : ""),

This doesn't seem to be great grammar, and translation guidelines
generally discourage this sort of incremental message construction
quite strongly. Since the user can probably infer what happened by a
suitable choice of log_line_prefix, I'm not totally sure this is worth
doing in the first place, but if we're going to do it, it should
probably have two completely separate message strings and pick between
them using IsParallelWorker(), rather than building it up
incrementally like this.
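
For example, a sketch with two self-contained message strings, using
the variables from the quoted hunk, might be:

if (IsParallelWorker())
	ereport(elevel,
			(errmsg("scanned index \"%s\" to remove %d row versions by parallel vacuum worker",
					RelationGetRelationName(indrel),
					dead_tuples->num_tuples)));
else
	ereport(elevel,
			(errmsg("scanned index \"%s\" to remove %d row versions",
					RelationGetRelationName(indrel),
					dead_tuples->num_tuples)));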

+compute_parallel_workers(Relation rel, int nrequests, int nindexes)

I think 'nrequests' is meant to be 'nrequested'. It isn't the number
of requests; it's the number of workers that were requested.

+ /* quick exit if no workers are prepared, e.g. under serializable isolation */

That comment makes very little sense in this context.

+ /* Report parallel vacuum worker information */
+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
+
+ appendStringInfo(&buf, ")");
+ ereport(elevel, (errmsg("%s", buf.data)));

This is another example of incremental message construction, again
violating translation guidelines.

+ WaitForParallelWorkersToAttach(lvstate->pcxt);

Why?

+ /*
+ * If there is already-updated result in the shared memory we use it.
+ * Otherwise we pass NULL to index AMs, meaning it's first time call,
+ * and copy the result to the shared memory segment.
+ */

I'm probably missing something here, but isn't the intention that we
only do each index once? If so, how would there be anything there
already? Once from for_cleanup = false and once for for_cleanup =
true?

+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

You could just do COMPARE_SCALAR_FIELD(options.flags);
COMPARE_SCALAR_FIELD(options.nworkers);
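
That is, the whole function would reduce to something like this
sketch (keeping the rels comparison from the original function):

static bool
_equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
{
	COMPARE_SCALAR_FIELD(options.flags);
	COMPARE_SCALAR_FIELD(options.nworkers);
	COMPARE_NODE_FIELD(rels);

	return true;
}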

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#32Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#31)

On Thu, Feb 28, 2019 at 2:44 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 14, 2019 at 5:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. Attached the rebased patch.

Here are some review comments.

Thank you for reviewing the patches!

+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> and <command>VACUUM</command>
+         without <literal>FULL</literal> option, and only when building
+         a B-tree index.  Parallel workers are taken from the pool of

That sentence is garbled. The end part about b-tree indexes applies
only to CREATE INDEX, not to VACUUM, since VACUUM does not build indexes.

Fixed.

+      Vacuum index and cleanup index in parallel
+      <replaceable class="parameter">N</replaceable> background
workers (for the detail
+      of each vacuum phases, please refer to <xref
linkend="vacuum-phases"/>. If the

I have two problems with this. One is that I can't understand the
English very well. I think you mean something like: "Perform the
'vacuum index' and 'cleanup index' phases of VACUUM in parallel using
N background workers," but I'm not entirely sure. The other is that
if that is what you mean, I don't think it's a sufficient description.
Users need to understand whether, for example, only one worker can be
used per index, or whether the work for a single index can be split
across workers.

+      parallel degree <replaceable class="parameter">N</replaceable>
is omitted,
+      then <command>VACUUM</command> decides the number of workers based on
+      number of indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Also if
this option

Now this makes it sound like it's one worker per index, but you could
be more explicit about it.

Fixed.

+      is specified multile times, the last parallel degree
+      <replaceable class="parameter">N</replaceable> is considered
into the account.

Typo, but I'd just delete this sentence altogether; the behavior if
the option is multiply specified seems like a triviality that need not
be documented.

Understood, removed.

+    Setting a value for <literal>parallel_workers</literal> via
+    <xref linkend="sql-altertable"/> also controls how many parallel
+    worker processes will be requested by a <command>VACUUM</command>
+    against the table. This setting is overwritten by setting
+    <replaceable class="parameter">N</replaceable> of
<literal>PARALLEL</literal>
+    option.

I wonder if we really want this behavior. Should a setting that
controls the degree of parallelism when scanning the table also affect
VACUUM? I tend to think that we probably don't ever want VACUUM of a
table to be parallel by default, but rather something that the user
must explicitly request. Happy to hear other opinions. If we do want
this behavior, I think this should be written differently, something
like this: The PARALLEL N option to VACUUM takes precedence over this
option.

For example, I can imagine a use case where a batch job does parallel
vacuum on some tables in a maintenance window. The batch operation
would need to compute and specify the degree of parallelism every
time, according to, for instance, the number of indexes, which would
be troublesome. But if we can set the degree of parallelism for each
table, it can just do 'VACUUM (PARALLEL)'.

+ * parallel mode nor destories the parallel context. For updating the index

Spelling.

Fixed.

+/* DSM keys for parallel lazy vacuum */
+#define PARALLEL_VACUUM_KEY_SHARED UINT64CONST(0xFFFFFFFFFFF00001)
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES UINT64CONST(0xFFFFFFFFFFF00002)
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT UINT64CONST(0xFFFFFFFFFFF00003)

Any special reason not to use just 1, 2, 3 here? The general
infrastructure stuff uses high numbers to avoid conflicting with
plan_node_id values, but end clients of the parallel infrastructure
can generally just use small integers.

It seems that I was worrying unnecessarily, changed to 1, 2, 3.
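
With that change the keys become plain small integers:

/* DSM keys for parallel lazy vacuum */
#define PARALLEL_VACUUM_KEY_SHARED			1
#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3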

+ bool updated; /* is the stats updated? */

is -> are

+ * LVDeadTuples controls the dead tuple TIDs collected during heap scan.

what do you mean by "controls", exactly? stores?

Fixed.

+ * This is allocated in a dynamic shared memory segment when parallel
+ * lazy vacuum mode, or allocated in a local memory.

If this is in DSM, then max_tuples is a wart, I think. We can't grow
the segment at that point. I'm suspicious that we need a better
design here. It looks like you gather all of the dead tuples in
backend-local memory and then allocate an equal amount of DSM to copy
them. But that means that we are using twice as much memory, which
seems pretty bad. You'd have to do that at least momentarily no
matter what, but it's not obvious that the backend-local copy is ever
freed.

Hmm, the current design is simpler; only the leader process scans the
heap and saves dead tuple TIDs to DSM. The DSM is allocated at once
when starting lazy vacuum and we never need to enlarge it. Also we
can use the same code around heap vacuum and collecting dead tuples
for both single-process vacuum and parallel vacuum. Once index vacuum
is completed, the leader process reinitializes the DSM and reuses it
the next time.
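
For reference, the shared dead-tuple area is laid out roughly as
follows; LVDeadTuples and max_tuples are names from the patch, while
the array field name here is illustrative:

typedef struct LVDeadTuples
{
	int			max_tuples;		/* # of slots allocated in the array */
	int			num_tuples;		/* current # of entries */
	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* TIDs of dead tuples */
} LVDeadTuples;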

There's another patch kicking around to allocate memory for
vacuum in chunks rather than preallocating the whole slab of memory at
once; we might want to think about getting that committed first and
then having this build on top of it. At least we need something
smarter than this.

Since the parallel vacuum uses memory in the same manner as the
single-process vacuum, it's no worse than before. I'd agree that that
patch is smarter and this patch could be built on top of it, but I'm
concerned that there are two proposals on that thread and the
discussion has not been active for 8 months. I wonder if it would be
worth improving the memory allocation based on that patch after the
parallel vacuum gets committed.

-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions options, VacuumParams *params,

We generally avoid passing a struct by value; copying the struct can
be expensive and having multiple shallow copies of the same data
sometimes leads to surprising results. I think it might be a good
idea to propose a preliminary refactoring patch that invents
VacuumOptions and gives it just a single 'int' member and refactors
everything to use it, and then that can be committed first. It should
pass a pointer, though, not the actual struct.

Agreed. I'll separate patches and propose it.

+ LVState *lvstate;

It's not clear to me why we need this new LVState thing. What's the
motivation for that? If it's a good idea, could it be done as a
separate, preparatory patch? It seems to be responsible for a lot of
code churn in this patch. It also leads to strange stuff like this:

The main motivations are refactoring and improving readability, but
it was mainly for the previous version of the patch, which implements
parallel heap vacuum. It might no longer be needed here. I'll try to
implement this without LVState. Thank you.

ereport(elevel,
- (errmsg("scanned index \"%s\" to remove %d row versions",
+ (errmsg("scanned index \"%s\" to remove %d row versions %s",
RelationGetRelationName(indrel),
- vacrelstats->num_dead_tuples),
+ dead_tuples->num_tuples,
+ IsParallelWorker() ? "by parallel vacuum worker" : ""),

This doesn't seem to be great grammar, and translation guidelines
generally discourage this sort of incremental message construction
quite strongly. Since the user can probably infer what happened by a
suitable choice of log_line_prefix, I'm not totally sure this is worth
doing in the first place, but if we're going to do it, it should
probably have two completely separate message strings and pick between
them using IsParallelWorker(), rather than building it up
incrementally like this.

Fixed.

+compute_parallel_workers(Relation rel, int nrequests, int nindexes)

I think 'nrequests' is meant to be 'nrequested'. It isn't the number
of requests; it's the number of workers that were requested.

Fixed.

+ /* quick exit if no workers are prepared, e.g. under serializable isolation */

That comment makes very little sense in this context.

Fixed.

+ /* Report parallel vacuum worker information */
+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
+
+ appendStringInfo(&buf, ")");
+ ereport(elevel, (errmsg("%s", buf.data)));

This is another example of incremental message construction, again
violating translation guidelines.

Fixed.

+ WaitForParallelWorkersToAttach(lvstate->pcxt);

Why?

Oh not necessary, removed.

+ /*
+ * If there is already-updated result in the shared memory we use it.
+ * Otherwise we pass NULL to index AMs, meaning it's first time call,
+ * and copy the result to the shared memory segment.
+ */

I'm probably missing something here, but isn't the intention that we
only do each index once? If so, how would there be anything there
already? Once from for_cleanup = false and once for for_cleanup =
true?

We call ambulkdelete (for_cleanup = false) zero or more times for
each index and call amvacuumcleanup (for_cleanup = true) at the end.
The first time it calls either ambulkdelete or amvacuumcleanup, lazy
vacuum must pass NULL to them. They return either a palloc'd
IndexBulkDeleteResult or NULL. If they return the former, lazy vacuum
must pass it back to them on the next call. In the current design,
since there is no guarantee that an index is always processed by the
same vacuum process, each vacuum process saves the result to DSM in
order to share those results among vacuum processes. The 'updated'
flag indicates that its slot is used. So we can pass the address in
DSM if 'updated' is true, otherwise pass NULL.
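
A sketch of that per-index logic for the bulk-delete case follows;
'shared', 'idx', and 'ivinfo' are illustrative names here, not
necessarily the patch's:

IndexBulkDeleteResult *stats_in;
IndexBulkDeleteResult *result;

/* pass the saved result if some process already vacuumed this index */
stats_in = shared->updated[idx] ? &(shared->stats[idx]) : NULL;

result = index_bulk_delete(&ivinfo, stats_in,
						   lazy_tid_reaped, (void *) dead_tuples);

/* copy a palloc'd result into DSM so any process can pass it next time */
if (result != NULL)
{
	if (result != stats_in)
	{
		memcpy(&(shared->stats[idx]), result, sizeof(IndexBulkDeleteResult));
		pfree(result);
	}
	shared->updated[idx] = true;
}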

+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

You could just do COMPARE_SCALAR_FIELD(options.flags);
COMPARE_SCALAR_FIELD(options.nworkers);

Fixed.

Almost all the comments I got have been incorporated into the local
branch, but a few comments need discussion. I'll submit the updated
version of the patch once I have addressed all of the comments.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#33Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#32)

On Fri, Mar 1, 2019 at 12:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I wonder if we really want this behavior. Should a setting that
controls the degree of parallelism when scanning the table also affect
VACUUM? I tend to think that we probably don't ever want VACUUM of a
table to be parallel by default, but rather something that the user
must explicitly request. Happy to hear other opinions. If we do want
this behavior, I think this should be written differently, something
like this: The PARALLEL N option to VACUUM takes precedence over this
option.

For example, I can imagine a use case where a batch job does parallel
vacuum on some tables in a maintenance window. The batch operation
would need to compute and specify the degree of parallelism every
time, according to, for instance, the number of indexes, which would
be troublesome. But if we can set the degree of parallelism for each
table, it can just do 'VACUUM (PARALLEL)'.

True, but the setting in question would also affect the behavior of
sequential scans and index scans. TBH, I'm not sure that the
parallel_workers reloption is really a great design as it is: is
hard-coding the number of workers really what people want? Do they
really want the same degree of parallelism for sequential scans and
index scans? Why should they want the same degree of parallelism also
for VACUUM? Maybe they do, and maybe somebody can explain why they do,
but as of now, it's not obvious to me why that should be true.

Since the parallel vacuum uses memory in the same manner as the
single-process vacuum, it's no worse than before. I'd agree that that
patch is smarter and this patch could be built on top of it, but I'm
concerned that there are two proposals on that thread and the
discussion has not been active for 8 months. I wonder if it would be
worth improving the memory allocation based on that patch after the
parallel vacuum gets committed.

Well, I think we can't just say "oh, this patch is going to use twice
as much memory as before," which is what it looks like it's doing
right now. If you think it's not doing that, can you explain further?

Agreed. I'll separate patches and propose it.

Cool. Probably best to keep that on this thread.

The main motivations are refactoring and improving readability, but
it was mainly for the previous version of the patch, which implements
parallel heap vacuum. It might no longer be needed here. I'll try to
implement this without LVState. Thank you.

Oh, OK.

+ /*
+ * If there is already-updated result in the shared memory we use it.
+ * Otherwise we pass NULL to index AMs, meaning it's first time call,
+ * and copy the result to the shared memory segment.
+ */

I'm probably missing something here, but isn't the intention that we
only do each index once? If so, how would there be anything there
already? Once from for_cleanup = false and once for for_cleanup =
true?

We call ambulkdelete (for_cleanup = false) zero or more times for
each index and call amvacuumcleanup (for_cleanup = true) at the end.
The first time it calls either ambulkdelete or amvacuumcleanup, lazy
vacuum must pass NULL to them. They return either a palloc'd
IndexBulkDeleteResult or NULL. If they return the former, lazy vacuum
must pass it back to them on the next call. In the current design,
since there is no guarantee that an index is always processed by the
same vacuum process, each vacuum process saves the result to DSM in
order to share those results among vacuum processes. The 'updated'
flag indicates that its slot is used. So we can pass the address in
DSM if 'updated' is true, otherwise pass NULL.

Ah, OK. Thanks for explaining.

Almost all the comments I got have been incorporated into the local
branch, but a few comments need discussion. I'll submit the updated
version of the patch once I have addressed all of the comments.

Cool.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#34Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#33)

On Sat, Mar 2, 2019 at 3:54 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 1, 2019 at 12:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I wonder if we really want this behavior. Should a setting that
controls the degree of parallelism when scanning the table also affect
VACUUM? I tend to think that we probably don't ever want VACUUM of a
table to be parallel by default, but rather something that the user
must explicitly request. Happy to hear other opinions. If we do want
this behavior, I think this should be written differently, something
like this: The PARALLEL N option to VACUUM takes precedence over this
option.

For example, I can imagine a use case where a batch job does parallel
vacuum on some tables in a maintenance window. The batch operation
would need to compute and specify the degree of parallelism every
time, according to, for instance, the number of indexes, which would
be troublesome. But if we can set the degree of parallelism for each
table, it can just do 'VACUUM (PARALLEL)'.

True, but the setting in question would also affect the behavior of
sequential scans and index scans. TBH, I'm not sure that the
parallel_workers reloption is really a great design as it is: is
hard-coding the number of workers really what people want? Do they
really want the same degree of parallelism for sequential scans and
index scans? Why should they want the same degree of parallelism also
for VACUUM? Maybe they do, and maybe somebody can explain why they do,
but as of now, it's not obvious to me why that should be true.

I think that there are users who want to specify the degree of
parallelism. I think that hard-coding the number of workers would be
a good design for something like VACUUM, which is a simple operation
on a single object; since there are no joins or aggregations it'd be
relatively easy to compute. That's why the patch introduces the
PARALLEL N option as well. I think that a reloption for parallel
vacuum would just be a way to save the degree of parallelism. And I
agree that users wouldn't want to use the same degree of parallelism
for VACUUM, so maybe it'd be better to add a new reloption like
parallel_vacuum_workers. On the other hand, that can be a separate
patch; I can remove the reloption part from this patch and propose it
when there are requests.

Since the parallel vacuum uses memory in the same manner as the
single-process vacuum, it's no worse than before. I'd agree that that
patch is smarter and this patch could be built on top of it, but I'm
concerned that there are two proposals on that thread and the
discussion has not been active for 8 months. I wonder if it would be
worth improving the memory allocation based on that patch after the
parallel vacuum gets committed.

Well, I think we can't just say "oh, this patch is going to use twice
as much memory as before," which is what it looks like it's doing
right now. If you think it's not doing that, can you explain further?

In the current design, the leader process allocates the whole DSM at
once when starting and records dead tuples' TIDs in the DSM. This is
the same behaviour as before, except that the TIDs go into the shared
memory segment. Once index vacuuming is finished, the leader process
re-initializes the DSM for the next round. So parallel vacuum uses the
same amount of memory as before during execution.
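
As a rough illustration (assuming the usual 6-byte ItemPointerData and
maintenance_work_mem = 64MB; the numbers are only for illustration):

    -- capacity of the single shared TID array, parallel or not
    SELECT pg_size_pretty((64 * 1024 * 1024)::bigint) AS tid_array_size,
           (64 * 1024 * 1024) / 6 AS approx_dead_tuple_capacity;

The TID array is sized once from maintenance_work_mem, just like the
local array in non-parallel vacuum, so adding workers doesn't multiply
it.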

Agreed. I'll separate the patches and propose them.

Cool. Probably best to keep that on this thread.

Understood.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#35Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#34)
3 attachment(s)

On Mon, Mar 4, 2019 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Mar 2, 2019 at 3:54 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 1, 2019 at 12:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I wonder if we really want this behavior. Should a setting that
controls the degree of parallelism when scanning the table also affect
VACUUM? I tend to think that we probably don't ever want VACUUM of a
table to be parallel by default, but rather something that the user
must explicitly request. Happy to hear other opinions. If we do want
this behavior, I think this should be written differently, something
like this: The PARALLEL N option to VACUUM takes precedence over this
option.

For example, I can imagine a use case where a batch job runs parallel
vacuum on some tables in a maintenance window. The batch operation
would need to compute and specify the degree of parallelism every time
according to, for instance, the number of indexes, which would be
troublesome. But if we can set the degree of parallelism for each
table, it can just do 'VACUUM (PARALLEL)'.

True, but the setting in question would also affect the behavior of
sequential scans and index scans. TBH, I'm not sure that the
parallel_workers reloption is really a great design as it is: is
hard-coding the number of workers really what people want? Do they
really want the same degree of parallelism for sequential scans and
index scans? Why should they want the same degree of parallelism also
for VACUUM? Maybe they do, and maybe somebody can explain why they do,
but as of now, it's not obvious to me why that should be true.

I think that there are users who want to specify the degree of
parallelism. I think that hard-coding the number of workers would be a
good design for something like VACUUM, which is a simple operation on a
single object; since there are no joins or aggregations, it'd be
relatively easy to compute it. That's why the patch introduces the
PARALLEL N option as well. I think that a reloption for parallel
vacuum would be just a way to save the degree of parallelism. And I
agree that users don't want to use the same degree of parallelism for
VACUUM, so maybe it'd be better to add a new reloption like
parallel_vacuum_workers. On the other hand, it can be a separate
patch; I can remove the reloption part from this patch and propose it
later when there are requests.

Okay, attached is the latest version of the patch set. I've
incorporated all the comments I got and separated out the patch that
makes vacuum options a Node (0001 patch). The patch no longer uses
parallel_workers; it might be proposed again in another form in the
future if requested.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v16-0001-Make-vacuum-options-a-Node.patch (application/octet-stream)
From 07370615c524c629a5c46957420c122567b2e0eb Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 15:15:13 +0900
Subject: [PATCH v16 1/3] Make vacuum options a Node.

Adds a new Node, VacuumOptions, for a follow-up commit. VacuumOptions
is passed down to the vacuum path but not to the analyze path.
---
 src/backend/access/heap/vacuumlazy.c | 14 +++---
 src/backend/commands/vacuum.c        | 90 ++++++++++++++++++------------------
 src/backend/nodes/copyfuncs.c        | 15 +++++-
 src/backend/nodes/equalfuncs.c       | 13 +++++-
 src/backend/parser/gram.y            | 58 +++++++++++++++--------
 src/backend/postmaster/autovacuum.c  | 14 +++---
 src/backend/tcop/utility.c           |  4 +-
 src/include/access/heapam.h          |  3 +-
 src/include/commands/vacuum.h        |  6 +--
 src/include/nodes/nodes.h            |  1 +
 src/include/nodes/parsenodes.h       | 16 +++++--
 11 files changed, 144 insertions(+), 90 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..2c33bf6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -150,7 +150,7 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumOptions *options,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
@@ -186,7 +186,7 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions *options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
 	LVRelStats *vacrelstats;
@@ -217,7 +217,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options->flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +245,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options->flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -469,7 +469,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
 	BlockNumber nblocks,
@@ -583,7 +583,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +638,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..843f626 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -67,13 +67,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options);
+static List *get_all_vacuum_rels(VacuumOptions *options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options,
 		   VacuumParams *params);
 
 /*
@@ -88,15 +88,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options->flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options->flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options->flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options->flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options->flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -115,7 +115,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options->flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -143,7 +143,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions node, indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -162,7 +162,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -173,7 +173,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options->flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -183,7 +183,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -205,8 +205,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options->flags & VACOPT_FULL) != 0 &&
+		(options->flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -215,7 +215,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -280,11 +280,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options->flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -334,13 +334,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options->flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -353,7 +353,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options->flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -389,7 +389,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -415,11 +415,11 @@ vacuum(int options, List *relations, VacuumParams *params,
  * ANALYZE.
  */
 bool
-vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
+vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int flags)
 {
 	char	   *relname;
 
-	Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+	Assert((flags & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
 
 	/*
 	 * Check permissions.
@@ -438,7 +438,7 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
 
 	relname = NameStr(reltuple->relname);
 
-	if ((options & VACOPT_VACUUM) != 0)
+	if ((flags & VACOPT_VACUUM) != 0)
 	{
 		if (reltuple->relisshared)
 			ereport(WARNING,
@@ -461,7 +461,7 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
 		return false;
 	}
 
-	if ((options & VACOPT_ANALYZE) != 0)
+	if ((flags & VACOPT_ANALYZE) != 0)
 	{
 		if (reltuple->relisshared)
 			ereport(WARNING,
@@ -490,14 +490,14 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
  */
 Relation
 vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
-					 int options, LOCKMODE lmode)
+					 int flags, LOCKMODE lmode)
 {
 	Relation	onerel;
 	bool		rel_lock = true;
 	int			elevel;
 
 	Assert(params != NULL);
-	Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+	Assert((flags & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
 
 	/*
 	 * Open the relation and get the appropriate lock on it.
@@ -508,7 +508,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	 * If we've been asked not to wait for the relation lock, acquire it first
 	 * in non-blocking mode, before calling try_relation_open().
 	 */
-	if (!(options & VACOPT_SKIP_LOCKED))
+	if (!(flags & VACOPT_SKIP_LOCKED))
 		onerel = try_relation_open(relid, lmode);
 	else if (ConditionalLockRelationOid(relid, lmode))
 		onerel = try_relation_open(relid, NoLock);
@@ -548,7 +548,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	else
 		return NULL;
 
-	if ((options & VACOPT_VACUUM) != 0)
+	if ((flags & VACOPT_VACUUM) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -569,7 +569,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 		return NULL;
 	}
 
-	if ((options & VACOPT_ANALYZE) != 0)
+	if ((flags & VACOPT_ANALYZE) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -602,7 +602,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -634,7 +634,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options->flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -646,7 +646,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -672,7 +672,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options->flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -741,7 +741,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -759,7 +759,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options->flags))
 			continue;
 
 		/*
@@ -1520,7 +1520,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1541,7 +1541,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1581,10 +1581,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options->flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options->flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1604,7 +1604,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options->flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1676,7 +1676,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_SKIPTOAST) && !(options->flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1695,7 +1695,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options->flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1703,7 +1703,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options->flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e15724b..7f937c9 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3843,12 +3843,22 @@ _copyDropdbStmt(const DropdbStmt *from)
 	return newnode;
 }
 
+static VacuumOptions *
+_copyVacuumOptions(const VacuumOptions *from)
+{
+	VacuumOptions *newnode = makeNode(VacuumOptions);
+
+	COPY_SCALAR_FIELD(flags);
+
+	return newnode;
+}
+
 static VacuumStmt *
 _copyVacuumStmt(const VacuumStmt *from)
 {
 	VacuumStmt *newnode = makeNode(VacuumStmt);
 
-	COPY_SCALAR_FIELD(options);
+	COPY_NODE_FIELD(options);
 	COPY_NODE_FIELD(rels);
 
 	return newnode;
@@ -5321,6 +5331,9 @@ copyObjectImpl(const void *from)
 		case T_DropdbStmt:
 			retval = _copyDropdbStmt(from);
 			break;
+		case T_VacuumOptions:
+			retval = _copyVacuumOptions(from);
+			break;
 		case T_VacuumStmt:
 			retval = _copyVacuumStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 31499eb..3dbbff4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1666,9 +1666,17 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 }
 
 static bool
+_equalVacuumOptions(const VacuumOptions *a, const VacuumOptions *b)
+{
+	COMPARE_SCALAR_FIELD(flags);
+
+	return true;
+}
+
+static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
+	COMPARE_NODE_FIELD(options);
 	COMPARE_NODE_FIELD(rels);
 
 	return true;
@@ -3386,6 +3394,9 @@ equal(const void *a, const void *b)
 		case T_DropdbStmt:
 			retval = _equalDropdbStmt(a, b);
 			break;
+		case T_VacuumOptions:
+			retval = _equalVacuumOptions(a, b);
+			break;
 		case T_VacuumStmt:
 			retval = _equalVacuumStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0279013..e7601da 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,6 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flags);
 
 %}
 
@@ -237,6 +238,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -305,8 +307,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10436,22 +10438,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM);
 					if ($2)
-						n->options |= VACOPT_FULL;
+						opt->flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						opt->flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						opt->flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						opt->flags |= VACOPT_ANALYZE;
+					n->options = opt;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options = $3;
+					n->options->flags |= VACOPT_VACUUM;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10459,20 +10463,25 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					$1->flags |= $3->flags;
+					pfree($3);
+					$$ = $1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL); }
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10484,16 +10493,17 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE);
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						opt->flags |= VACOPT_VERBOSE;
+					n->options = opt;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options =  makeVacOpt(VACOPT_ANALYZE | $3);
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16018,6 +16028,18 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 }
 
 /*
+ * Create a VacuumOptions with the given flags.
+ */
+static VacuumOptions *
+makeVacOpt(const VacuumFlag flags)
+{
+	VacuumOptions *opt = makeNode(VacuumOptions);
+
+	opt->flags = flags;
+	return opt;
+}
+
+/*
  * Merge the input and output parameters of a table function.
  */
 static List *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 347f91e..525a33b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -187,8 +187,8 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
-	VacuumParams at_params;
+	VacuumOptions	at_vacoptions;
+	VacuumParams	at_params;
 	int			at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
@@ -2481,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2882,7 +2882,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
@@ -3109,7 +3109,7 @@ autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 	rel = makeVacuumRelation(rangevar, tab->at_relid, NIL);
 	rel_list = list_make1(rel);
 
-	vacuum(tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
+	vacuum(&tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
 }
 
 /*
@@ -3131,10 +3131,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f..a735ff9 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options->flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options->flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ab08791..1c8525f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -18,6 +18,7 @@
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
+#include "nodes/parsenodes.h"
 #include "nodes/lockoptions.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
@@ -185,7 +186,7 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions *options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
 
 /* in heap/heapam_visibility.c */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..cfc6771 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -192,9 +192,9 @@ extern void vacuum_set_xid_limits(Relation rel,
 extern void vac_update_datfrozenxid(void);
 extern void vacuum_delay_point(void);
 extern bool vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple,
-						 int options);
+									 int flags);
 extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
-					 VacuumParams *params, int options, LOCKMODE lmode);
+					 VacuumParams *params, int flags, LOCKMODE lmode);
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f938925..3bbafcd 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
 	T_PartitionRangeDatum,
 	T_PartitionCmd,
 	T_VacuumRelation,
+	T_VacuumOptions,
 
 	/*
 	 * TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a7e859d..278e5d1 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3154,7 +3154,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3164,7 +3164,13 @@ typedef enum VacuumOption
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
 	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	NodeTag		type;
+	int			flags; /* OR of VacuumFlag */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3183,9 +3189,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions	*options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
-- 
2.10.5

v16-0003-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From 4343782af2570295c3a26df4fb48eb06fa59dab6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v16 3/3] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..b9799db 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=NUM              do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5
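
For reference, a sketch of the statements the new option generates (per
the prepare_vacuum_command() hunk above; the table name is a
placeholder):

    -- vacuumdb -P 2 issues, for each target table:
    VACUUM (PARALLEL 2) some_table;
    -- and vacuumdb -P with no degree issues:
    VACUUM (PARALLEL) some_table;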

v16-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 6ada025d2b3ca57bb071556a63d8f56e908562b6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 09:31:41 +0900
Subject: [PATCH v16 2/3] Add parallel option to VACUUM command

In parallel vacuum, we do both index vacuuming and index cleanup
in parallel with parallel worker processes if the table has
more than one index. All processes, including the leader process,
process indexes one by one.

Parallel vacuum can be requested by specifying, for example,
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with 2
parallel worker processes.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  20 +
 src/backend/access/heap/vacuumlazy.c  | 855 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |   6 +
 src/backend/nodes/copyfuncs.c         |   1 +
 src/backend/nodes/equalfuncs.c        |   1 +
 src/backend/parser/gram.y             |  62 ++-
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   3 +
 src/include/nodes/parsenodes.h        |   4 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 14 files changed, 851 insertions(+), 128 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6d42b7a..840eadd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2209,13 +2209,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..1d7a002 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">N</replaceable> background
+      workers (for the detail of each vacuum phases, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index.
+      Workers for vacuum launches before starting each phases and exit at the end
+      of the phase. If the parallel degree
+      <replaceable class="parameter">N</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. This option can not
+      use with  <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2c33bf6..5f1eed4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Each index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (in lazy_scan_heap) we prepare the parallel context and
+ * initialize the shared memory segment that contains shared information as
+ * well as the memory space for dead tuples. When starting either index
+ * vacuuming or index cleanup, we launch parallel worker processes. Once all
+ * indexes are processed the parallel worker processes exit and the leader
+ * process re-initializes the shared memory segment. Note that parallel workers
+ * live only during one index vacuuming or index cleanup pass, but the leader
+ * process neither exits parallel mode nor destroys the parallel context. For
+ * updating the index statistics, since updates are not allowed during parallel
+ * mode, we update them after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,88 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Structs for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a dynamic shared memory segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in a dynamic shared memory segment in parallel
+ * lazy vacuum mode, or in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel vacuum workers. This is allocated in
+ * a dynamic shared memory segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	bool	is_wraparound;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either vacuuming index or
+	 * cleanup index.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields used for index vacuuming, index cleanup, or both, as necessary
+	 * for IndexVacuumInfo.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either the
+	 * old live tuples in index vacuuming or the new live tuples in index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +226,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -152,15 +245,17 @@ static BufferAccessStrategy vac_strategy;
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumOptions *options,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
-			   bool aggressive);
+			   bool aggressive, bool is_wraparound);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
-static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+static IndexBulkDeleteResult *lazy_vacuum_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples,
+									LVDeadTuples	*dead_tuples);
+static IndexBulkDeleteResult *lazy_cleanup_index(Relation indrel,
+									IndexBulkDeleteResult *stats,
+									double reltuples, bool estimated_count,
+									bool update_stats);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,13 +263,27 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested, bool is_wraparound);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes,
+							  bool update_indstats);
+static bool lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+									IndexBulkDeleteResult **stats,
+									LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -261,7 +370,8 @@ heap_vacuum_rel(Relation onerel, VacuumOptions *options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive,
+				   params->is_wraparound);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -464,14 +574,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers. When
+ *		allocating the space for lazy scan heap, we enter parallel mode, create
+ *		the parallel context and initialize a dynamic shared memory segment for dead
+ *		tuples. dead_tuples points either to the dynamic shared memory segment in
+ *		the parallel vacuum case or to local memory in the single process vacuum case.
+ *		Before starting parallel index vacuuming and parallel index cleanup we launch
+ *		parallel workers. All parallel workers exit after processing all indexes,
+ *		and the leader process re-initializes the parallel context so that it can
+ *		re-launch them at the next execution. The index statistics are updated by
+ *		the leader after exiting parallel mode, since writes are not allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
 lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive)
+			   Relation *Irel, int nindexes, bool aggressive, bool is_wraparound)
 {
+	LVParallelState *lps = NULL;	/* non-NULL means ready for parallel vacuum */
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -486,7 +611,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
 				nunused;		/* unused item pointers */
-	IndexBulkDeleteResult **indstats;
+	IndexBulkDeleteResult **indstats = NULL;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -494,6 +619,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -519,9 +645,6 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
 
-	indstats = (IndexBulkDeleteResult **)
-		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
-
 	nblocks = RelationGetNumberOfBlocks(onerel);
 	vacrelstats->rel_pages = nblocks;
 	vacrelstats->scanned_pages = 0;
@@ -529,13 +652,47 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((options->flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													options->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* Enter the parallel mode and prepare parallel vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers, is_wraparound);
+		lps->nworkers_requested = options->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/*
+	 * Allocate the memory for index bulk-delete results if in single-process
+	 * vacuum mode. In parallel mode, we've already prepared it in the shared
+	 * memory segment.
+	 */
+	if (!lps)
+		indstats = (IndexBulkDeleteResult **)
+			palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -713,8 +870,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +899,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+									lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +920,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1116,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1155,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1295,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1364,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1375,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1491,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1525,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1541,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+								lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1417,8 +1569,12 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+							lps, true);
+
+	/* End parallel vacuum, update index statistics */
+	if (lps)
+		lazy_end_parallel(lps, Irel, nindexes, true);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1641,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1650,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1503,8 +1659,8 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 			++tupindex;
 			continue;
 		}
-		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex,
+									vacrelstats, &vmbuffer);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1542,6 +1698,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1709,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1839,94 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If we're ready for parallel vacuum, this is
+ * performed with parallel workers; accordingly, this function must be used
+ * only by the parallel vacuum leader process.
+ *
+ * In parallel lazy vacuum, we copy the index bulk-deletion results returned by
+ * ambulkdelete and amvacuumcleanup to shared memory, because they are allocated
+ * locally and the same index may be vacuumed by a different vacuum process the
+ * next time.
+ *
+ * Since each vacuum worker writes its bulk-delete result to a different slot,
+ * we can write them without locking.
+ */
+static void
+lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+						IndexBulkDeleteResult **stats, LVParallelState *lps,
+						bool for_cleanup)
+{
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+
+	/* nothing to do if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	/* Launch parallel vacuum workers if we're ready */
+	if (lps)
+		do_parallel = lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+													   for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get the next index number to vacuum */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Set the index bulk-deletion result to pass to the index AM */
+		if (do_parallel)
+		{
+			/*
+			 * If there is an already-updated result in the shared memory we
+			 * use it. Otherwise we pass NULL to the index AMs, as they expect
+			 * NULL for the first execution.
+			 */
+			if (lps->lvshared->indstats[idx].updated)
+				result = &(lps->lvshared->indstats[idx].stats);
+		}
+		else
+			result = stats[idx];
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics during parallel mode.
+		 */
+		if (for_cleanup)
+			result = lazy_cleanup_index(Irel[idx], result,
+										vacrelstats->new_rel_tuples,
+										vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+										!do_parallel);
+		else
+			result = lazy_vacuum_index(Irel[idx], result,
+									   vacrelstats->old_rel_pages,
+									   vacrelstats->dead_tuples);
+
+		if (do_parallel && result)
+		{
+			/* Save the index bulk-deletion result to the shared memory space */
+			memcpy(&(lps->lvshared->indstats[idx].stats), result,
+				   sizeof(IndexBulkDeleteResult));
+
+			/* Mark as updated so the saved result is passed back next time */
+			lps->lvshared->indstats[idx].updated = true;
+		}
+		else if (result)
+		{
+			/* In single-process mode, remember the result for the next round */
+			stats[idx] = result;
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lps, for_cleanup);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1689,12 +1934,13 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
  */
-static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult *stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,57 +1949,56 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
-	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+	res = index_bulk_delete(&ivinfo, stats,
+							lazy_tid_reaped, (void *) dead_tuples);
 
+	if (IsParallelWorker())
+		msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+	else
+		msg = "scanned index \"%s\" to remove %d row versions";
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
+
+	return res;
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
-static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+static IndexBulkDeleteResult *
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult *stats,
+				   double reltuples, bool estimated_count, bool update_stats)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	stats = index_vacuum_cleanup(&ivinfo, stats);
 
 	if (!stats)
-		return;
+		return NULL;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
 	 * is accurate.
 	 */
-	if (!stats->estimated_count)
+	if (!stats->estimated_count && update_stats)
 		vac_update_relstats(indrel,
 							stats->num_pages,
 							stats->num_index_tuples,
@@ -1763,8 +2008,13 @@ lazy_cleanup_index(Relation indrel,
 							InvalidMultiXactId,
 							false);
 
+	if (IsParallelWorker())
+		msg = "index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker";
+	else
+		msg = "index \"%s\" now contains %.0f row versions in %u pages";
+
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
 					stats->num_index_tuples,
 					stats->num_pages),
@@ -1775,7 +2025,14 @@ lazy_cleanup_index(Relation indrel,
 					   stats->pages_deleted, stats->pages_free,
 					   pg_rusage_show(&ru0))));
 
-	pfree(stats);
+	if (update_stats)
+	{
+		/* Must not be in parallel mode, as the stats would then be in DSM */
+		Assert(!IsInParallelMode());
+		pfree(stats);
+		return NULL;	/* don't hand back a freed pointer */
+	}
+
+	return stats;
 }
 
 /*
@@ -2080,19 +2337,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2361,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2417,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2570,390 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuuming has each index processed by one vacuum process. The sizes
+ * of the table and indexes don't affect the parallel degree.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
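+	/* The leader process also vacuums indexes, so at most (nindexes - 1) workers are useful */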
+	if (nrequested)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, then allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested, bool is_wraparound)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		i;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested, true);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);
+	estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* create the DSM */
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = relid;
+	shared->is_wraparound = is_wraparound;
+	shared->elevel = elevel;
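+	/* Workers and the leader claim the next index by fetch-and-add on this counter */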
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(shared->indstats[i]);
+		s->updated = false;
+		MemSet(&(s->stats), 0, sizeof(IndexBulkDeleteResult));
+	}
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode. If
+ * 'update_indstats' is true, we copy the statistics of all indexes before
+ * destroying the parallel context, and then update them after exiting
+ * parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes,
+				  bool update_indstats)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (update_indstats)
+	{
+		Assert(Irel != NULL && nindexes > 0);
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+		memcpy(copied_indstats, lps->lvshared->indstats,
+			   sizeof(LVIndStats) * nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	if (update_indstats)
+	{
+		int i;
+
+		for (i = 0; i < nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(Irel[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+	}
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Request workers to do either index vacuuming or index cleanup.
+	 */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lps, for_cleanup);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+	/*
+	 * Unless this was the cleanup phase, reinitialize the DSM space so that
+	 * we can relaunch parallel workers for the next execution.
+	 */
+	if (!for_cleanup)
+		ReinitializeParallelDSM(lps->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	/* Set lazy vacuum state and open relations */
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either index vacuuming or index cleanup */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up indexes. This function must be used only by parallel
+ * vacuum worker processes. As the leader process does in parallel lazy
+ * vacuum, we copy the index bulk-deletion results to the shared memory segment.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *result = NULL;
+
+		/* Get next index to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If we already have a result returned by the index AM, we pass it
+		 * back. Otherwise pass NULL, as index AMs expect NULL for the first
+		 * execution.
+		 */
+		if (lvshared->indstats[idx].updated)
+			result = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuuming or cleanup one index */
+		if (for_cleanup)
+			result = lazy_cleanup_index(indrels[idx], result, lvshared->reltuples,
+									   lvshared->estimated_count, false);
+		else
+			result = lazy_vacuum_index(indrels[idx], result, lvshared->reltuples,
+									  dead_tuples);
+
+		if (result)
+		{
+			/* Save index bulk-deletion result to the shared memory space */
+			memcpy(&(lvshared->indstats[idx].stats), result,
+				   sizeof(IndexBulkDeleteResult));
+
+			/* Mark as updated so the saved result is passed back next time */
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce2b616..fb1e951 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -138,6 +139,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 843f626..ec6efc4 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -111,6 +111,12 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
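+	/* Parallelism applies only to lazy vacuum, so PARALLEL cannot be combined with FULL */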
+	if ((vacstmt->options->flags & VACOPT_FULL) &&
+		(vacstmt->options->flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7f937c9..1c4ac81 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3849,6 +3849,7 @@ _copyVacuumOptions(const VacuumOptions *from)
 	VacuumOptions *newnode = makeNode(VacuumOptions);
 
 	COPY_SCALAR_FIELD(flags);
+	COPY_SCALAR_FIELD(nworkers);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3dbbff4..0869a10 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1669,6 +1669,7 @@ static bool
 _equalVacuumOptions(const VacuumOptions *a, const VacuumOptions *b)
 {
 	COMPARE_SCALAR_FIELD(flags);
+	COMPARE_SCALAR_FIELD(nworkers);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e7601da..f6acf5c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -187,7 +187,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
-static VacuumOptions *makeVacOpt(VacuumFlag flags);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -10438,7 +10438,7 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM);
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM, 0);
 					if ($2)
 						opt->flags |= VACOPT_FULL;
 					if ($3)
@@ -10454,8 +10454,10 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = $3;
-					n->options->flags |= VACOPT_VACUUM;
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM, 0);
+					opt->flags = VACOPT_VACUUM | $3->flags;
+					opt->nworkers = $3->nworkers;
+					n->options = opt;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10465,23 +10467,38 @@ vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
 			| vacuum_option_list ',' vacuum_option_elem
 				{
-					$1->flags |= $3->flags;
-					pfree($3);
-					$$ = $1;
+					VacuumOptions *opt1 = $1;
+					VacuumOptions *opt2 = $3;
+
+					opt1->flags |= opt2->flags;
+					if (opt2->flags == VACOPT_PARALLEL)
+						opt1->nworkers = opt2->nworkers;
+					pfree(opt2);
+					$$ = opt1;
 				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE); }
-			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE); }
-			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE); }
-			| FULL				{ $$ = makeVacOpt(VACOPT_FULL); }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+				{
+					if ($2 < 1)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("parallel vacuum degree must be at least 1"),
+								 parser_errposition(@1)));
+					$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+				}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING);
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = makeVacOpt(VACOPT_SKIP_LOCKED);
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10493,7 +10510,8 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE);
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE, 0);
+
 					if ($2)
 						opt->flags |= VACOPT_VERBOSE;
 					n->options = opt;
@@ -10503,7 +10521,9 @@ AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options =  makeVacOpt(VACOPT_ANALYZE | $3);
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE, 0);
+					opt->flags = VACOPT_ANALYZE | $3;
+					n->options = opt;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16027,16 +16047,18 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
 /*
- * Create a VacuumOptions with the given flags.
+ * Create a VacuumOptions with the given options.
  */
 static VacuumOptions *
-makeVacOpt(const VacuumFlag flags)
+makeVacOpt(VacuumFlag flag, int nworkers)
 {
-	VacuumOptions *opt = makeNode(VacuumOptions);
+	VacuumOptions *vacopt = makeNode(VacuumOptions);
 
-	opt->flags = flags;
-	return opt;
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
 }
 
 /*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 525a33b..05898cf 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21c..fef80c4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3429,7 +3429,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 1c8525f..63e7a9e 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,12 +14,14 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
 #include "nodes/parsenodes.h"
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -188,6 +190,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel, VacuumOptions *options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 278e5d1..46e7fff 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3163,13 +3163,15 @@ typedef enum VacuumFlag
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy vacuum in parallel */
 } VacuumFlag;
 
 typedef struct VacuumOptions
 {
 	NodeTag		type;
 	int			flags; /* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
 } VacuumOptions;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#36Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#35)

On Wed, Mar 6, 2019 at 1:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Okay, attached the latest version of patch set. I've incorporated all
comments I got and separated the patch for making vacuum options a
Node (0001 patch). And the patch doesn't use parallel_workers. It
might be proposed in another form again in the future if
requested.

Why make it a Node? I mean I think a struct makes sense, but what's
the point of giving it a NodeTag?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#37Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#36)

On Thu, Mar 7, 2019 at 2:54 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 6, 2019 at 1:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Okay, attached the latest version of patch set. I've incorporated all
comments I got and separated the patch for making vacuum options a
Node (0001 patch). And the patch doesn't use parallel_workers. It
might be proposed in the another form again in the future if
requested.

Why make it a Node? I mean I think a struct makes sense, but what's
the point of giving it a NodeTag?

Well, the main point is consistency with other nodes and keeping the code clean.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#38Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#37)

On Wed, Mar 6, 2019 at 10:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Why make it a Node? I mean I think a struct makes sense, but what's
the point of giving it a NodeTag?

Well, the main point is consistency with other nodes and keeping the code clean.

It looks to me like if we made it a plain struct rather than a node,
and embedded that struct (not a pointer) in VacuumStmt, then what
would happen is that _copyVacuumStmt and _equalVacuumStmt would have
clauses for each vacuum option individually, with a dot, like
COPY_SCALAR_FIELD(options.flags).
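
For concreteness, here's a minimal sketch of that alternative -- the
field names other than flags/nworkers are illustrative, not a finished
proposal:

typedef struct VacuumOptions
{
	int			flags;		/* OR of VacuumFlag; no NodeTag */
	int			nworkers;	/* # of parallel vacuum workers */
} VacuumOptions;

typedef struct VacuumStmt
{
	NodeTag		type;
	VacuumOptions options;	/* embedded by value, not a pointer */
	List	   *rels;		/* list of VacuumRelation, or NIL for all */
} VacuumStmt;

static VacuumStmt *
_copyVacuumStmt(const VacuumStmt *from)
{
	VacuumStmt *newnode = makeNode(VacuumStmt);

	/* each embedded option gets its own clause, with a dot */
	COPY_SCALAR_FIELD(options.flags);
	COPY_SCALAR_FIELD(options.nworkers);
	COPY_NODE_FIELD(rels);

	return newnode;
}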

Also, the grammar production for VacuumStmt would need to be jiggered
around a bit; the way that options consolidation is done there would
have to be changed.

Neither of those things sounds terribly hard or terribly messy, but on
the other hand I guess there's nothing really wrong with the way you
did it, either ... anybody else have an opinion?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#38)

On Fri, Mar 8, 2019 at 12:22 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 6, 2019 at 10:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Why make it a Node? I mean I think a struct makes sense, but what's
the point of giving it a NodeTag?

Well, the main point is consistency with other nodes and keeping the code clean.

It looks to me like if we made it a plain struct rather than a node,
and embedded that struct (not a pointer) in VacuumStmt, then what
would happen is that _copyVacuumStmt and _equalVacuumStmt would have
clauses for each vacuum option individually, with a dot, like
COPY_SCALAR_FIELD(options.flags).

Also, the grammar production for VacuumStmt would need to be jiggered
around a bit; the way that options consolidation is done there would
have to be changed.

Neither of those things sounds terribly hard or terribly messy, but on
the other hand I guess there's nothing really wrong with the way you
did it, either ... anybody else have an opinion?

I don't have a strong opinion, but using a Node would be more
suitable in the future when we add more options to vacuum. And it
seems to me that it's unlikely we'd change a Node back to a plain
struct, so one idea is to do it now anyway, since we might need to
do it someday.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#40Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#39)
1 attachment(s)

On Wed, Mar 13, 2019 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I don't have a strong opinion, but using a Node would be more
suitable in the future when we add more options to vacuum. And it
seems to me that it's unlikely we'd change a Node back to a plain
struct, so one idea is to do it now anyway, since we might need to
do it someday.

I just tried to apply 0001 again and noticed a conflict in the
autovac_table structure in postmaster.c.

That conflict got me thinking: aren't parameters and options an awful
lot alike? Why do we need to pass around a VacuumOptions structure
*and* a VacuumParams structure to all of these functions? Couldn't we
just have one? That led to the attached patch, which just gets rid of
the separate options flag and folds it into VacuumParams. If we took
this approach, the degree of parallelism would just be another thing
that would get added to VacuumParams, and VacuumOptions wouldn't end
up existing at all.

This patch does not address the question of what the *parse tree*
representation of the PARALLEL option should look like; the idea would
be that ExecVacuum() would need to extract the value for that option and
put it into VacuumParams just as it already does for various other
settings. Maybe the most natural approach would be to
convert the grammar productions for the VACUUM options list so that
they just build a list of DefElems, and then have ExecVacuum() iterate
over that list and make sense of it, as for example ExplainQuery()
already does.
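
A rough sketch of that shape, assuming the grammar hands ExecVacuum() a
plain List of DefElem and that VacuumParams has grown an nworkers field
(both are assumptions here, not part of the attached patch; option
handling is abbreviated):

void
ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
{
	VacuumParams params;
	ListCell   *lc;

	/* start from defaults, then fold in each option */
	MemSet(&params, 0, sizeof(params));

	foreach(lc, vacstmt->options)
	{
		DefElem    *opt = lfirst_node(DefElem, lc);

		if (strcmp(opt->defname, "verbose") == 0)
			params.options |= VACOPT_VERBOSE;
		else if (strcmp(opt->defname, "freeze") == 0)
			params.options |= VACOPT_FREEZE;
		else if (strcmp(opt->defname, "parallel") == 0)
			params.nworkers = defGetInt32(opt);	/* PARALLEL N */
		else
			ereport(ERROR,
					(errcode(ERRCODE_SYNTAX_ERROR),
					 errmsg("unrecognized VACUUM option \"%s\"",
							opt->defname)));
	}

	/* ... then proceed into vacuum() with the filled-in params ... */
}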

I kinda like the idea of doing it that way, but then I came up with
it, so maybe you or others will think it's terrible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

vacuum-options-into-params.patchapplication/octet-stream; name=vacuum-options-into-params.patchDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31889..5c554f9465 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -186,7 +186,7 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
 	LVRelStats *vacrelstats;
@@ -217,7 +217,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (params->options & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +245,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (params->options & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -261,7 +261,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a534fa944d..3465713d10 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -84,7 +84,7 @@ static MemoryContext anl_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
-static void do_analyze_rel(Relation onerel, int options,
+static void do_analyze_rel(Relation onerel,
 			   VacuumParams *params, List *va_cols,
 			   AcquireSampleRowsFunc acquirefunc, BlockNumber relpages,
 			   bool inh, bool in_outer_xact, int elevel);
@@ -115,7 +115,7 @@ static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
  * use it once we've successfully opened the rel, since it might be stale.
  */
 void
-analyze_rel(Oid relid, RangeVar *relation, int options,
+analyze_rel(Oid relid, RangeVar *relation,
 			VacuumParams *params, List *va_cols, bool in_outer_xact,
 			BufferAccessStrategy bstrategy)
 {
@@ -125,7 +125,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	BlockNumber relpages = 0;
 
 	/* Select logging level */
-	if (options & VACOPT_VERBOSE)
+	if (params->options & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -147,7 +147,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 *
 	 * Make sure to generate only logs for ANALYZE in this case.
 	 */
-	onerel = vacuum_open_relation(relid, relation, options & ~(VACOPT_VACUUM),
+	onerel = vacuum_open_relation(relid, relation, params->options & ~(VACOPT_VACUUM),
 								  params->log_min_duration >= 0,
 								  ShareUpdateExclusiveLock);
 
@@ -165,7 +165,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_ANALYZE))
+								  params->options & VACOPT_ANALYZE))
 	{
 		relation_close(onerel, ShareUpdateExclusiveLock);
 		return;
@@ -237,7 +237,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	else
 	{
 		/* No need for a WARNING if we already complained during VACUUM */
-		if (!(options & VACOPT_VACUUM))
+		if (!(params->options & VACOPT_VACUUM))
 			ereport(WARNING,
 					(errmsg("skipping \"%s\" --- cannot analyze non-tables or special system tables",
 							RelationGetRelationName(onerel))));
@@ -257,14 +257,14 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 * tables, which don't contain any rows.
 	 */
 	if (onerel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
-		do_analyze_rel(onerel, options, params, va_cols, acquirefunc,
+		do_analyze_rel(onerel, params, va_cols, acquirefunc,
 					   relpages, false, in_outer_xact, elevel);
 
 	/*
 	 * If there are child tables, do recursive ANALYZE.
 	 */
 	if (onerel->rd_rel->relhassubclass)
-		do_analyze_rel(onerel, options, params, va_cols, acquirefunc, relpages,
+		do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
 					   true, in_outer_xact, elevel);
 
 	/*
@@ -292,7 +292,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
  * appropriate acquirefunc for each child table.
  */
 static void
-do_analyze_rel(Relation onerel, int options, VacuumParams *params,
+do_analyze_rel(Relation onerel, VacuumParams *params,
 			   List *va_cols, AcquireSampleRowsFunc acquirefunc,
 			   BlockNumber relpages, bool inh, bool in_outer_xact,
 			   int elevel)
@@ -603,7 +603,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
 	 * VACUUM ANALYZE, don't overwrite the accurate count already inserted by
 	 * VACUUM.
 	 */
-	if (!inh && !(options & VACOPT_VACUUM))
+	if (!inh && !(params->options & VACOPT_VACUUM))
 	{
 		for (ind = 0; ind < nindexes; ind++)
 		{
@@ -634,7 +634,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
 							  (va_cols == NIL));
 
 	/* If this isn't part of VACUUM ANALYZE, let index AMs do cleanup */
-	if (!(options & VACOPT_VACUUM))
+	if (!(params->options & VACOPT_VACUUM))
 	{
 		for (ind = 0; ind < nindexes; ind++)
 		{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 99b3aa6d3f..3d59c6c37c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -74,8 +74,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
-		   VacuumParams *params);
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
 
 /*
  * Primary entry point for manual VACUUM and ANALYZE commands
@@ -112,6 +111,9 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	/* copy options from parse tree */
+	params.options = vacstmt->options;
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -138,7 +140,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	params.log_min_duration = -1;
 
 	/* Now go through the common routine */
-	vacuum(vacstmt->options, vacstmt->rels, &params, NULL, isTopLevel);
+	vacuum(vacstmt->rels, &params, NULL, isTopLevel);
 }
 
 /*
@@ -163,7 +165,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +176,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (params->options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +186,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +208,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((params->options & VACOPT_FULL) != 0 &&
+		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +218,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -257,7 +259,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 			List	   *sublist;
 			MemoryContext old_context;
 
-			sublist = expand_vacuum_rel(vrel, options);
+			sublist = expand_vacuum_rel(vrel, params->options);
 			old_context = MemoryContextSwitchTo(vac_context);
 			newrels = list_concat(newrels, sublist);
 			MemoryContextSwitchTo(old_context);
@@ -265,7 +267,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		relations = newrels;
 	}
 	else
-		relations = get_all_vacuum_rels(options);
+		relations = get_all_vacuum_rels(params->options);
 
 	/*
 	 * Decide whether we need to start/commit our own transactions.
@@ -281,11 +283,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (params->options & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +337,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (params->options & VACOPT_VACUUM)
 			{
-				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
+				if (!vacuum_rel(vrel->oid, vrel->relation, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (params->options & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +356,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +392,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -1519,7 +1521,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1540,7 +1542,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(params->options & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1580,10 +1582,11 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (params->options & VACOPT_FULL) ?
+		AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, options,
+	onerel = vacuum_open_relation(relid, relation, params->options,
 								  params->log_min_duration >= 0, lmode);
 
 	/* leave if relation could not be opened or locked */
@@ -1604,7 +1607,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  params->options & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1676,7 +1679,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(params->options & VACOPT_SKIPTOAST) && !(params->options & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1695,7 +1698,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (params->options & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1703,14 +1706,14 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((params->options & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
 		cluster_rel(relid, InvalidOid, cluster_options);
 	}
 	else
-		heap_vacuum_rel(onerel, options, params, vac_strategy);
+		heap_vacuum_rel(onerel, params, vac_strategy);
 
 	/* Roll back any GUC changes executed by index functions */
 	AtEOXact_GUC(false, save_nestlevel);
@@ -1736,7 +1739,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * totally unimportant for toast relations.
 	 */
 	if (toast_relid != InvalidOid)
-		vacuum_rel(toast_relid, NULL, options, params);
+		vacuum_rel(toast_relid, NULL, params);
 
 	/*
 	 * Now release the session-level lock on the master table.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3bfac919c4..fa875db816 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,7 +188,6 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
 	VacuumParams at_params;
 	double		at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
@@ -2482,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_params.options & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,7 +2882,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_params.options = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
@@ -3110,7 +3109,7 @@ autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 	rel = makeVacuumRelation(rangevar, tab->at_relid, NIL);
 	rel_list = list_make1(rel);
 
-	vacuum(tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
+	vacuum(rel_list, &tab->at_params, bstrategy, true);
 }
 
 /*
@@ -3132,10 +3131,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_params.options & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_params.options & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 1b6607fe90..eb9e160bfd 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,7 +217,7 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
 
 /* in heap/heapam_visibility.c */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 17ba4ba653..edaccbabb4 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -141,6 +141,7 @@ typedef struct VacAttrStats
  */
 typedef struct VacuumParams
 {
+	int			options;		/* VACOPT_* constants */
 	int			freeze_min_age; /* min freeze age, -1 to use default */
 	int			freeze_table_age;	/* age at which to scan whole table */
 	int			multixact_freeze_min_age;	/* min multixact freeze age, -1 to
@@ -163,7 +164,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -197,7 +198,7 @@ extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
 					 int options, bool verbose, LOCKMODE lmode);
 
 /* in commands/analyze.c */
-extern void analyze_rel(Oid relid, RangeVar *relation, int options,
+extern void analyze_rel(Oid relid, RangeVar *relation,
 			VacuumParams *params, List *va_cols, bool in_outer_xact,
 			BufferAccessStrategy bstrategy);
 extern bool std_typanalyze(VacAttrStats *stats);
#41Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#40)
2 attachment(s)

On Thu, Mar 14, 2019 at 6:41 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 13, 2019 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I don't have a strong opinion, but using a Node would be more
suitable in the future when we add more options to vacuum. It also
seems unlikely that we would ever change a Node back into a plain
struct, so there is an argument for doing it now anyway if we might
need it someday.

I just tried to apply 0001 again and noticed a conflict in the
autovac_table structure in postmaster.c.

That conflict got me thinking: aren't parameters and options an awful
lot alike? Why do we need to pass around a VacuumOptions structure
*and* a VacuumParams structure to all of these functions? Couldn't we
just have one? That led to the attached patch, which just gets rid of
the separate options flag and folds it into VacuumParams.

Indeed. I like this approach. The comment of vacuum() says,

* options is a bitmask of VacuumOption flags, indicating what to do.
* (snip)
* params contains a set of parameters that can be used to customize the
* behavior.

It seems to me that the purposes of the two variables are different,
but merging them would be acceptable all the same.

BTW, your patch no longer applies cleanly to the current HEAD, and
the comment of vacuum() needs to be updated.
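
For reference, the shape of the change is roughly as follows (a
minimal sketch based on the attached
vacuum-options-into-params_v2.patch; the elided fields are unchanged):

	/* the options bitmask folded into the params struct */
	typedef struct VacuumParams
	{
		int			options;		/* bitmask of VACOPT_* flags */
		int			freeze_min_age; /* min freeze age, -1 to use default */
		/* ... remaining fields as before ... */
	} VacuumParams;

	/* call sites then test the flags through the single struct */
	if (params->options & VACOPT_VACUUM)
		vacuum_rel(vrel->oid, vrel->relation, params);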

If we took
this approach, the degree of parallelism would just be another thing
that would get added to VacuumParams, and VacuumOptions wouldn't end
up existing at all.

Agreed.

This patch does not address the question of what the *parse tree*
representation of the PARALLEL option should look like; the idea would
be that ExecVacuum() would need to extract the value for that option and
put it into VacuumParams just as it already does for various other
things in VacuumParams. Maybe the most natural approach would be to
convert the grammar productions for the VACUUM options list so that
they just build a list of DefElems, and then have ExecVacuum() iterate
over that list and make sense of it, as for example ExplainQuery()
already does.

Agreed. That change would also help the discussion about changing the
VACUUM option syntax to a field-and-value style.
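
For reference, a minimal sketch of what ExecVacuum() then does with
the DefElem list, mirroring the attached vacuum-grammer.patch:

	ListCell   *lc;

	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;

	foreach(lc, vacstmt->options)
	{
		DefElem    *opt = (DefElem *) lfirst(lc);

		if (strcmp(opt->defname, "verbose") == 0)
			params.options |= VACOPT_VERBOSE;
		else if (strcmp(opt->defname, "skip_locked") == 0)
			params.options |= VACOPT_SKIP_LOCKED;
		/* ... other options, see the patch for the full list ... */
		else
			ereport(ERROR,
					(errcode(ERRCODE_SYNTAX_ERROR),
					 errmsg("unrecognized VACUUM option \"%s\"", opt->defname),
					 parser_errposition(pstate, opt->location)));
	}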

Attached are the updated patch you proposed and, on top of it, the
patch that converts the grammar productions for the VACUUM options.
The latter patch moves VacuumOption to vacuum.h, since the parser no
longer needs that information.

If we take this direction, I will change the parallel vacuum patch so
that it adds a new PARALLEL option and an 'nworkers' field to
VacuumParams.
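
Concretely, that would just be one more arm in ExecVacuum()'s option
loop, something like the following (a hypothetical sketch; the
VACOPT_PARALLEL flag and the nworkers field don't exist yet and are
assumptions about the next version of the patch):

	else if (strcmp(opt->defname, "parallel") == 0)
	{
		params.options |= VACOPT_PARALLEL;	/* hypothetical flag */
		params.nworkers = defGetInt32(opt);	/* parallel degree */
	}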

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

vacuum-grammer.patch (application/octet-stream)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 90ff6aa..0d820e0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -83,20 +83,55 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
  * happen in vacuum().
  */
 void
-ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
+ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 {
 	VacuumParams params;
+	ListCell	*lc;
+
+	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
+
+	/* Parse options list */
+	foreach(lc, vacstmt->options)
+	{
+		DefElem	*opt = (DefElem *) lfirst(lc);
+
+		/* Parse common options for VACUUM and ANALYZE */
+		if (strcmp(opt->defname, "verbose") == 0)
+			params.options |= VACOPT_VERBOSE;
+		else if (strcmp(opt->defname, "skip_locked") == 0)
+			params.options |= VACOPT_SKIP_LOCKED;
+		else if (!vacstmt->is_vacuumcmd)
+			ereport(ERROR,
+					(errcode(ERRCODE_SYNTAX_ERROR),
+					 errmsg("unrecognized ANALYZE option \"%s\"", opt->defname),
+					 parser_errposition(pstate, opt->location)));
+
+		/* Parse options available on VACUUM */
+		else if (strcmp(opt->defname, "analyze") == 0)
+				params.options |= VACOPT_ANALYZE;
+		else if (strcmp(opt->defname, "freeze") == 0)
+				params.options |= VACOPT_FREEZE;
+		else if (strcmp(opt->defname, "full") == 0)
+			params.options |= VACOPT_FULL;
+		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
+			params.options |= VACOPT_DISABLE_PAGE_SKIPPING;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_SYNTAX_ERROR),
+					 errmsg("unrecognized VACUUM option \"%s\"", opt->defname),
+					 parser_errposition(pstate, opt->location)));
+	}
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((params.options & VACOPT_VACUUM) ||
+		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(params.options & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(params.options & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -111,14 +146,11 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
-	/* copy options from parse tree */
-	params.options = vacstmt->options;
-
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (params.options & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e23e68f..e814939 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -306,8 +306,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <str>		vac_analyze_option_name
+%type <defelt>	vac_analyze_option_elem
+%type <list>	vac_analyze_option_list
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10460,85 +10461,62 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					n->options = NIL;
 					if ($2)
-						n->options |= VACOPT_FULL;
+						n->options = lappend(n->options,
+											 makeDefElem("full", NULL, @2));
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						n->options = lappend(n->options,
+											 makeDefElem("freeze", NULL, @3));
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						n->options = lappend(n->options,
+											 makeDefElem("verbose", NULL, @4));
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						n->options = lappend(n->options,
+											 makeDefElem("analyze", NULL, @5));
 					n->rels = $6;
+					n->is_vacuumcmd = true;
 					$$ = (Node *)n;
 				}
-			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
+			| VACUUM '(' vac_analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options = $3;
 					n->rels = $5;
+					n->is_vacuumcmd = true;
 					$$ = (Node *) n;
 				}
 		;
 
-vacuum_option_list:
-			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
-		;
-
-vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
-			| IDENT
-				{
-					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
-					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
-					else
-						ereport(ERROR,
-								(errcode(ERRCODE_SYNTAX_ERROR),
-							 errmsg("unrecognized VACUUM option \"%s\"", $1),
-									 parser_errposition(@1)));
-				}
-		;
-
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					n->options = NIL;
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						n->options = lappend(n->options,
+											 makeDefElem("verbose", NULL, @2));
 					n->rels = $3;
+					n->is_vacuumcmd = false;
 					$$ = (Node *)n;
 				}
-			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
+			| analyze_keyword '(' vac_analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options = $3;
 					n->rels = $5;
+					n->is_vacuumcmd = false;
 					$$ = (Node *) n;
 				}
 		;
 
-analyze_option_list:
-			analyze_option_elem								{ $$ = $1; }
-			| analyze_option_list ',' analyze_option_elem	{ $$ = $1 | $3; }
-		;
-
-analyze_option_elem:
-			VERBOSE				{ $$ = VACOPT_VERBOSE; }
-			| IDENT
+vac_analyze_option_list:
+			vac_analyze_option_elem
 				{
-					if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
-					else
-						ereport(ERROR,
-								(errcode(ERRCODE_SYNTAX_ERROR),
-								 errmsg("unrecognized ANALYZE option \"%s\"", $1),
-									 parser_errposition(@1)));
+					$$ = list_make1($1);
+				}
+			| vac_analyze_option_list ',' vac_analyze_option_elem
+				{
+					$$ = lappend($1, $3);
 				}
 		;
 
@@ -10547,6 +10525,18 @@ analyze_keyword:
 			| ANALYSE /* British */					{}
 		;
 
+vac_analyze_option_elem:
+			vac_analyze_option_name
+				{
+					$$ = makeDefElem($1, NULL, @1);
+				}
+		;
+
+vac_analyze_option_name:
+			NonReservedWord							{ $$ = $1; }
+			| analyze_keyword						{ $$ = "analyze"; }
+		;
+
 opt_analyze:
 			analyze_keyword							{ $$ = true; }
 			| /*EMPTY*/								{ $$ = false; }
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f..bdfaa50 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,10 +664,10 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery(stmt->is_vacuumcmd ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
-				ExecVacuum(stmt, isTopLevel);
+				ExecVacuum(pstate, stmt, isTopLevel);
 			}
 			break;
 
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->is_vacuumcmd)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 97a4da5..eb1d0ba 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -136,8 +136,23 @@ typedef struct VacAttrStats
 	int			rowstride;
 } VacAttrStats;
 
+typedef enum VacuumOption
+{
+	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
+	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
+	VACOPT_VERBOSE = 1 << 2,	/* print progress info */
+	VACOPT_FREEZE = 1 << 3,		/* FREEZE option */
+	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
+	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
+	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+} VacuumOption;
+
 /*
  * Parameters customizing behavior of VACUUM and ANALYZE.
+ *
+ * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
+ * in options.
  */
 typedef struct VacuumParams
 {
@@ -163,7 +178,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 
 /* in commands/vacuum.c */
-extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
+extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
 extern void vacuum(List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fe35783..fcfba4b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3151,21 +3151,16 @@ typedef struct ClusterStmt
  *		Vacuum and Analyze Statements
  *
  * Even though these are nominally two statements, it's convenient to use
- * just one node type for both.  Note that at least one of VACOPT_VACUUM
- * and VACOPT_ANALYZE must be set in options.
+ * just one node type for both.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef struct VacuumStmt
 {
-	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
-	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
-	VACOPT_VERBOSE = 1 << 2,	/* print progress info */
-	VACOPT_FREEZE = 1 << 3,		/* FREEZE option */
-	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
-	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
-	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+	NodeTag		type;
+	List		*options;		/* list of DefElem nodes */
+	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	bool		is_vacuumcmd;	/* true for VACUUM, false for ANALYZE */
+} VacuumStmt;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3182,13 +3177,6 @@ typedef struct VacuumRelation
 	List	   *va_cols;		/* list of column names, or NIL for all */
 } VacuumRelation;
 
-typedef struct VacuumStmt
-{
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
-} VacuumStmt;
-
 /* ----------------------
  *		Explain Statement
  *
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..07d0703 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -116,8 +116,12 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  unrecognized ANALYZE option "nonexistent"
+ERROR:  syntax error at or near "-"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
+                            ^
+ANALYZE (nonexistentarg) does_not_exit;
+ERROR:  unrecognized ANALYZE option "nonexistentarg"
+LINE 1: ANALYZE (nonexistentarg) does_not_exit;
                  ^
 -- ensure argument order independence, and that SKIP_LOCKED on non-existing
 -- relation still errors out.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..81f3822 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -92,6 +92,7 @@ ANALYZE vactst (i), vacparted (does_not_exist);
 -- parenthesized syntax for ANALYZE
 ANALYZE (VERBOSE) does_not_exist;
 ANALYZE (nonexistent-arg) does_not_exist;
+ANALYZE (nonexistentarg) does_not_exit;
 
 -- ensure argument order independence, and that SKIP_LOCKED on non-existing
 -- relation still errors out.
vacuum-options-into-params_v2.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..5c554f9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -186,7 +186,7 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
 	LVRelStats *vacrelstats;
@@ -217,7 +217,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (params->options & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +245,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (params->options & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -261,7 +261,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index c819235..079a096 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -84,7 +84,7 @@ static MemoryContext anl_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
-static void do_analyze_rel(Relation onerel, int options,
+static void do_analyze_rel(Relation onerel,
 			   VacuumParams *params, List *va_cols,
 			   AcquireSampleRowsFunc acquirefunc, BlockNumber relpages,
 			   bool inh, bool in_outer_xact, int elevel);
@@ -115,7 +115,7 @@ static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
  * use it once we've successfully opened the rel, since it might be stale.
  */
 void
-analyze_rel(Oid relid, RangeVar *relation, int options,
+analyze_rel(Oid relid, RangeVar *relation,
 			VacuumParams *params, List *va_cols, bool in_outer_xact,
 			BufferAccessStrategy bstrategy)
 {
@@ -125,7 +125,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	BlockNumber relpages = 0;
 
 	/* Select logging level */
-	if (options & VACOPT_VERBOSE)
+	if (params->options & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -148,7 +148,6 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 * Make sure to generate only logs for ANALYZE in this case.
 	 */
 	onerel = vacuum_open_relation(relid, relation, params,
-								  options & ~(VACOPT_VACUUM),
 								  ShareUpdateExclusiveLock);
 
 	/* leave if relation could not be opened or locked */
@@ -165,7 +164,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_ANALYZE))
+								  params->options & VACOPT_ANALYZE))
 	{
 		relation_close(onerel, ShareUpdateExclusiveLock);
 		return;
@@ -237,7 +236,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	else
 	{
 		/* No need for a WARNING if we already complained during VACUUM */
-		if (!(options & VACOPT_VACUUM))
+		if (!(params->options & VACOPT_VACUUM))
 			ereport(WARNING,
 					(errmsg("skipping \"%s\" --- cannot analyze non-tables or special system tables",
 							RelationGetRelationName(onerel))));
@@ -257,14 +256,14 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
 	 * tables, which don't contain any rows.
 	 */
 	if (onerel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
-		do_analyze_rel(onerel, options, params, va_cols, acquirefunc,
+		do_analyze_rel(onerel, params, va_cols, acquirefunc,
 					   relpages, false, in_outer_xact, elevel);
 
 	/*
 	 * If there are child tables, do recursive ANALYZE.
 	 */
 	if (onerel->rd_rel->relhassubclass)
-		do_analyze_rel(onerel, options, params, va_cols, acquirefunc, relpages,
+		do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
 					   true, in_outer_xact, elevel);
 
 	/*
@@ -292,7 +291,7 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
  * appropriate acquirefunc for each child table.
  */
 static void
-do_analyze_rel(Relation onerel, int options, VacuumParams *params,
+do_analyze_rel(Relation onerel, VacuumParams *params,
 			   List *va_cols, AcquireSampleRowsFunc acquirefunc,
 			   BlockNumber relpages, bool inh, bool in_outer_xact,
 			   int elevel)
@@ -603,7 +602,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
 	 * VACUUM ANALYZE, don't overwrite the accurate count already inserted by
 	 * VACUUM.
 	 */
-	if (!inh && !(options & VACOPT_VACUUM))
+	if (!inh && !(params->options & VACOPT_VACUUM))
 	{
 		for (ind = 0; ind < nindexes; ind++)
 		{
@@ -634,7 +633,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
 							  (va_cols == NIL));
 
 	/* If this isn't part of VACUUM ANALYZE, let index AMs do cleanup */
-	if (!(options & VACOPT_VACUUM))
+	if (!(params->options & VACOPT_VACUUM))
 	{
 		for (ind = 0; ind < nindexes; ind++)
 		{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1b5b50c..90ff6aa 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -74,8 +74,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
-		   VacuumParams *params);
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
 
 /*
  * Primary entry point for manual VACUUM and ANALYZE commands
@@ -112,6 +111,9 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	/* copy options from parse tree */
+	params.options = vacstmt->options;
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -138,14 +140,12 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	params.log_min_duration = -1;
 
 	/* Now go through the common routine */
-	vacuum(vacstmt->options, vacstmt->rels, &params, NULL, isTopLevel);
+	vacuum(vacstmt->rels, &params, NULL, isTopLevel);
 }
 
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
- *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
  * if a valid OID is supplied, the table with that OID is what to process;
@@ -163,8 +163,8 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
-	   BufferAccessStrategy bstrategy, bool isTopLevel)
+vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
+	   bool isTopLevel)
 {
 	static bool in_vacuum = false;
 
@@ -174,7 +174,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (params->options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +184,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +206,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((params->options & VACOPT_FULL) != 0 &&
+		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +216,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -257,7 +257,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 			List	   *sublist;
 			MemoryContext old_context;
 
-			sublist = expand_vacuum_rel(vrel, options);
+			sublist = expand_vacuum_rel(vrel, params->options);
 			old_context = MemoryContextSwitchTo(vac_context);
 			newrels = list_concat(newrels, sublist);
 			MemoryContextSwitchTo(old_context);
@@ -265,7 +265,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		relations = newrels;
 	}
 	else
-		relations = get_all_vacuum_rels(options);
+		relations = get_all_vacuum_rels(params->options);
 
 	/*
 	 * Decide whether we need to start/commit our own transactions.
@@ -281,11 +281,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (params->options & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +335,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (params->options & VACOPT_VACUUM)
 			{
-				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
+				if (!vacuum_rel(vrel->oid, vrel->relation, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (params->options & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +354,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +390,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -491,14 +491,14 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
  */
 Relation
 vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
-					 int options, LOCKMODE lmode)
+					 LOCKMODE lmode)
 {
 	Relation	onerel;
 	bool		rel_lock = true;
 	int			elevel;
 
 	Assert(params != NULL);
-	Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+	Assert((params->options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
 
 	/*
 	 * Open the relation and get the appropriate lock on it.
@@ -509,7 +509,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	 * If we've been asked not to wait for the relation lock, acquire it first
 	 * in non-blocking mode, before calling try_relation_open().
 	 */
-	if (!(options & VACOPT_SKIP_LOCKED))
+	if (!(params->options & VACOPT_SKIP_LOCKED))
 		onerel = try_relation_open(relid, lmode);
 	else if (ConditionalLockRelationOid(relid, lmode))
 		onerel = try_relation_open(relid, NoLock);
@@ -549,7 +549,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	else
 		return NULL;
 
-	if ((options & VACOPT_VACUUM) != 0)
+	if ((params->options & VACOPT_VACUUM) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -570,7 +570,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 		return NULL;
 	}
 
-	if ((options & VACOPT_ANALYZE) != 0)
+	if ((params->options & VACOPT_ANALYZE) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -1521,7 +1521,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1542,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(params->options & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1582,11 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (params->options & VACOPT_FULL) ?
+		AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1606,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  params->options & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1678,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(params->options & VACOPT_SKIPTOAST) && !(params->options & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1697,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (params->options & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,14 +1705,14 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((params->options & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
 		cluster_rel(relid, InvalidOid, cluster_options);
 	}
 	else
-		heap_vacuum_rel(onerel, options, params, vac_strategy);
+		heap_vacuum_rel(onerel, params, vac_strategy);
 
 	/* Roll back any GUC changes executed by index functions */
 	AtEOXact_GUC(false, save_nestlevel);
@@ -1737,7 +1738,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * totally unimportant for toast relations.
 	 */
 	if (toast_relid != InvalidOid)
-		vacuum_rel(toast_relid, NULL, options, params);
+		vacuum_rel(toast_relid, NULL, params);
 
 	/*
 	 * Now release the session-level lock on the master table.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3bfac91..fa875db 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,7 +188,6 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
 	VacuumParams at_params;
 	double		at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
@@ -2482,7 +2481,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_params.options & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,7 +2882,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_params.options = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
@@ -3110,7 +3109,7 @@ autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 	rel = makeVacuumRelation(rangevar, tab->at_relid, NIL);
 	rel_list = list_make1(rel);
 
-	vacuum(tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
+	vacuum(rel_list, &tab->at_params, bstrategy, true);
 }
 
 /*
@@ -3132,10 +3131,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_params.options & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_params.options & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 1b6607f..eb9e160 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,7 +217,7 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
 
 /* in heap/heapam_visibility.c */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..97a4da5 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -141,6 +141,7 @@ typedef struct VacAttrStats
  */
 typedef struct VacuumParams
 {
+	int			options;		/* VACOPT_* constants */
 	int			freeze_min_age; /* min freeze age, -1 to use default */
 	int			freeze_table_age;	/* age at which to scan whole table */
 	int			multixact_freeze_min_age;	/* min multixact freeze age, -1 to
@@ -163,7 +164,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -194,10 +195,10 @@ extern void vacuum_delay_point(void);
 extern bool vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple,
 						 int options);
 extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
-					 VacuumParams *params, int options, LOCKMODE lmode);
+					 VacuumParams *params, LOCKMODE lmode);
 
 /* in commands/analyze.c */
-extern void analyze_rel(Oid relid, RangeVar *relation, int options,
+extern void analyze_rel(Oid relid, RangeVar *relation,
 			VacuumParams *params, List *va_cols, bool in_outer_xact,
 			BufferAccessStrategy bstrategy);
 extern bool std_typanalyze(VacAttrStats *stats);
#42Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#29)
3 attachment(s)

On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. I've attached the rebased patch.

I ran some performance tests to compare the parallelism benefits,

Thank you for testing!

but I got some strange results showing a performance overhead; maybe
it is because I tested it on my laptop.

Hmm, I think parallel vacuum would help for heavy workloads, such as
a big table with multiple indexes. In your test results, all
executions complete within 1 second, which looks like a case where
parallel vacuum wouldn't help. I suspect that the table is small,
right? Anyway, I'll also do performance tests.

Here are the performance test results. I set up a 500MB table with
several indexes and made 10% of the table dirty before each vacuum,
then compared the execution time of the patched postgres with the
current HEAD (see the 'speed_up' column). In my environment:

 indexes | parallel_degree |  patched   |    head    | speed_up
---------+-----------------+------------+------------+----------
       0 |               0 |   238.2085 |   244.7625 |   1.0275
       0 |               1 |   237.7050 |   244.7625 |   1.0297
       0 |               2 |   238.0390 |   244.7625 |   1.0282
       0 |               4 |   238.1045 |   244.7625 |   1.0280
       0 |               8 |   237.8995 |   244.7625 |   1.0288
       0 |              16 |   237.7775 |   244.7625 |   1.0294
       1 |               0 |  1328.8590 |  1334.9125 |   1.0046
       1 |               1 |  1325.9140 |  1334.9125 |   1.0068
       1 |               2 |  1333.3665 |  1334.9125 |   1.0012
       1 |               4 |  1329.5205 |  1334.9125 |   1.0041
       1 |               8 |  1334.2255 |  1334.9125 |   1.0005
       1 |              16 |  1335.1510 |  1334.9125 |   0.9998
       2 |               0 |  2426.2905 |  2427.5165 |   1.0005
       2 |               1 |  1416.0595 |  2427.5165 |   1.7143
       2 |               2 |  1411.6270 |  2427.5165 |   1.7197
       2 |               4 |  1411.6490 |  2427.5165 |   1.7196
       2 |               8 |  1410.1750 |  2427.5165 |   1.7214
       2 |              16 |  1413.4985 |  2427.5165 |   1.7174
       4 |               0 |  4622.5060 |  4619.0340 |   0.9992
       4 |               1 |  2536.8435 |  4619.0340 |   1.8208
       4 |               2 |  2548.3615 |  4619.0340 |   1.8126
       4 |               4 |  1467.9655 |  4619.0340 |   3.1466
       4 |               8 |  1486.3155 |  4619.0340 |   3.1077
       4 |              16 |  1481.7150 |  4619.0340 |   3.1174
       8 |               0 |  9039.3810 |  8990.4735 |   0.9946
       8 |               1 |  4807.5880 |  8990.4735 |   1.8701
       8 |               2 |  3786.7620 |  8990.4735 |   2.3742
       8 |               4 |  2924.2205 |  8990.4735 |   3.0745
       8 |               8 |  2684.2545 |  8990.4735 |   3.3493
       8 |              16 |  2672.9800 |  8990.4735 |   3.3635
      16 |               0 | 17821.4715 | 17740.1300 |   0.9954
      16 |               1 |  9318.3810 | 17740.1300 |   1.9038
      16 |               2 |  7260.6315 | 17740.1300 |   2.4433
      16 |               4 |  5538.5225 | 17740.1300 |   3.2030
      16 |               8 |  5368.5255 | 17740.1300 |   3.3045
      16 |              16 |  5291.8510 | 17740.1300 |   3.3523
(36 rows)

Attached are the updated versions of the patches. They apply cleanly
to the current HEAD, but the 0001 patch still changes the vacuum
options to a Node, since that point is still under discussion. Once
the direction has been decided, I'll update the patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v17-0001-Make-vacuum-options-a-Node.patch (application/octet-stream)
From 19ce143069f757ba74b7d4137617e3b9cc6f8e23 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 15:15:13 +0900
Subject: [PATCH v17 1/3] Make vacuum options a Node.

Adds a new Node, VacuumOptions, for a follow-up commit. VacuumOptions
is passed down to the vacuum path but not to the analyze path.
---
 src/backend/access/heap/vacuumlazy.c | 14 +++---
 src/backend/commands/vacuum.c        | 90 ++++++++++++++++++------------------
 src/backend/nodes/copyfuncs.c        | 15 +++++-
 src/backend/nodes/equalfuncs.c       | 13 +++++-
 src/backend/parser/gram.y            | 58 +++++++++++++++--------
 src/backend/postmaster/autovacuum.c  | 12 ++---
 src/backend/tcop/utility.c           |  4 +-
 src/include/access/heapam.h          |  3 +-
 src/include/commands/vacuum.h        |  6 +--
 src/include/nodes/nodes.h            |  1 +
 src/include/nodes/parsenodes.h       | 16 +++++--
 11 files changed, 143 insertions(+), 89 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..2c33bf6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -150,7 +150,7 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumOptions *options,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
@@ -186,7 +186,7 @@ static bool heap_page_is_all_visible(Relation rel, Buffer buf,
  *		and locked the relation.
  */
 void
-heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
+heap_vacuum_rel(Relation onerel, VacuumOptions *options, VacuumParams *params,
 				BufferAccessStrategy bstrategy)
 {
 	LVRelStats *vacrelstats;
@@ -217,7 +217,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		starttime = GetCurrentTimestamp();
 	}
 
-	if (options & VACOPT_VERBOSE)
+	if (options->flags & VACOPT_VERBOSE)
 		elevel = INFO;
 	else
 		elevel = DEBUG2;
@@ -245,7 +245,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
-	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+	if (options->flags & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
@@ -469,7 +469,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
 	BlockNumber nblocks,
@@ -583,7 +583,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +638,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((options->flags & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1b5b50c..bdd7d46 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -68,13 +68,13 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options);
+static List *get_all_vacuum_rels(VacuumOptions *options);
 static void vac_truncate_clog(TransactionId frozenXID,
 				  MultiXactId minMulti,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
-static bool vacuum_rel(Oid relid, RangeVar *relation, int options,
+static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options,
 		   VacuumParams *params);
 
 /*
@@ -89,15 +89,15 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 
 	/* sanity checks on options */
-	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
-	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
+	Assert(vacstmt->options->flags & (VACOPT_VACUUM | VACOPT_ANALYZE));
+	Assert((vacstmt->options->flags & VACOPT_VACUUM) ||
+		   !(vacstmt->options->flags & (VACOPT_FULL | VACOPT_FREEZE)));
+	Assert(!(vacstmt->options->flags & VACOPT_SKIPTOAST));
 
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
-	if (!(vacstmt->options & VACOPT_ANALYZE))
+	if (!(vacstmt->options->flags & VACOPT_ANALYZE))
 	{
 		ListCell   *lc;
 
@@ -116,7 +116,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options->flags & VACOPT_FREEZE)
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
@@ -144,7 +144,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
- * options is a bitmask of VacuumOption flags, indicating what to do.
+ * options is a VacuumOptions, indicating what to do.
  *
  * relations, if not NIL, is a list of VacuumRelation to process; otherwise,
  * we process all relevant tables in the database.  For each VacuumRelation,
@@ -163,7 +163,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
  * memory context that will not disappear at transaction commit.
  */
 void
-vacuum(int options, List *relations, VacuumParams *params,
+vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel)
 {
 	static bool in_vacuum = false;
@@ -174,7 +174,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 
 	Assert(params != NULL);
 
-	stmttype = (options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
+	stmttype = (options->flags & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
 
 	/*
 	 * We cannot run VACUUM inside a user transaction block; if we were inside
@@ -184,7 +184,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 *
 	 * ANALYZE (without VACUUM) can run either way.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
 		in_outer_xact = false;
@@ -206,8 +206,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 	/*
 	 * Sanity check DISABLE_PAGE_SKIPPING option.
 	 */
-	if ((options & VACOPT_FULL) != 0 &&
-		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+	if ((options->flags & VACOPT_FULL) != 0 &&
+		(options->flags & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
@@ -216,7 +216,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
 	/*
@@ -281,11 +281,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 	 * transaction block, and also in an autovacuum worker, use own
 	 * transactions so we can release locks sooner.
 	 */
-	if (options & VACOPT_VACUUM)
+	if (options->flags & VACOPT_VACUUM)
 		use_own_xacts = true;
 	else
 	{
-		Assert(options & VACOPT_ANALYZE);
+		Assert(options->flags & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
 		else if (in_outer_xact)
@@ -335,13 +335,13 @@ vacuum(int options, List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, options, params))
 					continue;
 			}
 
-			if (options & VACOPT_ANALYZE)
+			if (options->flags & VACOPT_ANALYZE)
 			{
 				/*
 				 * If using separate xacts, start one for analyze. Otherwise,
@@ -354,7 +354,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 					PushActiveSnapshot(GetTransactionSnapshot());
 				}
 
-				analyze_rel(vrel->oid, vrel->relation, options, params,
+				analyze_rel(vrel->oid, vrel->relation, options->flags, params,
 							vrel->va_cols, in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
@@ -390,7 +390,7 @@ vacuum(int options, List *relations, VacuumParams *params,
 		StartTransactionCommand();
 	}
 
-	if ((options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+	if ((options->flags & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 	{
 		/*
 		 * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
@@ -416,11 +416,11 @@ vacuum(int options, List *relations, VacuumParams *params,
  * ANALYZE.
  */
 bool
-vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
+vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int flags)
 {
 	char	   *relname;
 
-	Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+	Assert((flags & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
 
 	/*
 	 * Check permissions.
@@ -439,7 +439,7 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
 
 	relname = NameStr(reltuple->relname);
 
-	if ((options & VACOPT_VACUUM) != 0)
+	if ((flags & VACOPT_VACUUM) != 0)
 	{
 		if (reltuple->relisshared)
 			ereport(WARNING,
@@ -462,7 +462,7 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
 		return false;
 	}
 
-	if ((options & VACOPT_ANALYZE) != 0)
+	if ((flags & VACOPT_ANALYZE) != 0)
 	{
 		if (reltuple->relisshared)
 			ereport(WARNING,
@@ -491,14 +491,14 @@ vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
  */
 Relation
 vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
-					 int options, LOCKMODE lmode)
+					 int flags, LOCKMODE lmode)
 {
 	Relation	onerel;
 	bool		rel_lock = true;
 	int			elevel;
 
 	Assert(params != NULL);
-	Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+	Assert((flags & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
 
 	/*
 	 * Open the relation and get the appropriate lock on it.
@@ -509,7 +509,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	 * If we've been asked not to wait for the relation lock, acquire it first
 	 * in non-blocking mode, before calling try_relation_open().
 	 */
-	if (!(options & VACOPT_SKIP_LOCKED))
+	if (!(flags & VACOPT_SKIP_LOCKED))
 		onerel = try_relation_open(relid, lmode);
 	else if (ConditionalLockRelationOid(relid, lmode))
 		onerel = try_relation_open(relid, NoLock);
@@ -549,7 +549,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 	else
 		return NULL;
 
-	if ((options & VACOPT_VACUUM) != 0)
+	if ((flags & VACOPT_VACUUM) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -570,7 +570,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
 		return NULL;
 	}
 
-	if ((options & VACOPT_ANALYZE) != 0)
+	if ((flags & VACOPT_ANALYZE) != 0)
 	{
 		if (!rel_lock)
 			ereport(elevel,
@@ -603,7 +603,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, VacuumParams *params,
  * are made in vac_context.
  */
 static List *
-expand_vacuum_rel(VacuumRelation *vrel, int options)
+expand_vacuum_rel(VacuumRelation *vrel, VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	MemoryContext oldcontext;
@@ -635,7 +635,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * below, as well as find_all_inheritors's expectation that the caller
 		 * holds some lock on the starting relation.
 		 */
-		rvr_opts = (options & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
+		rvr_opts = (options->flags & VACOPT_SKIP_LOCKED) ? RVR_SKIP_LOCKED : 0;
 		relid = RangeVarGetRelidExtended(vrel->relation,
 										 AccessShareLock,
 										 rvr_opts,
@@ -647,7 +647,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 */
 		if (!OidIsValid(relid))
 		{
-			if (options & VACOPT_VACUUM)
+			if (options->flags & VACOPT_VACUUM)
 				ereport(WARNING,
 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 						 errmsg("skipping vacuum of \"%s\" --- lock not available",
@@ -673,7 +673,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 		 * Make a returnable VacuumRelation for this rel if user is a proper
 		 * owner.
 		 */
-		if (vacuum_is_relation_owner(relid, classForm, options))
+		if (vacuum_is_relation_owner(relid, classForm, options->flags))
 		{
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
@@ -742,7 +742,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumOptions *options)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
@@ -760,7 +760,7 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, options->flags))
 			continue;
 
 		/*
@@ -1521,7 +1521,7 @@ vac_truncate_clog(TransactionId frozenXID,
  *		At entry and exit, we are not inside a transaction.
  */
 static bool
-vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
+vacuum_rel(Oid relid, RangeVar *relation, VacuumOptions *options, VacuumParams *params)
 {
 	LOCKMODE	lmode;
 	Relation	onerel;
@@ -1542,7 +1542,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	if (!(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_FULL))
 	{
 		/*
 		 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
@@ -1582,10 +1582,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either
 	 * way, we can be sure that no other backend is vacuuming the same table.
 	 */
-	lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
+	lmode = (options->flags & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;
 
 	/* open the relation and get the appropriate lock on it */
-	onerel = vacuum_open_relation(relid, relation, params, options, lmode);
+	onerel = vacuum_open_relation(relid, relation, params, options->flags, lmode);
 
 	/* leave if relation could not be opened or locked */
 	if (!onerel)
@@ -1605,7 +1605,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (!vacuum_is_relation_owner(RelationGetRelid(onerel),
 								  onerel->rd_rel,
-								  options & VACOPT_VACUUM))
+								  options->flags & VACOPT_VACUUM))
 	{
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
@@ -1677,7 +1677,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
 	 */
-	if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
+	if (!(options->flags & VACOPT_SKIPTOAST) && !(options->flags & VACOPT_FULL))
 		toast_relid = onerel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -1696,7 +1696,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (options & VACOPT_FULL)
+	if (options->flags & VACOPT_FULL)
 	{
 		int			cluster_options = 0;
 
@@ -1704,7 +1704,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
-		if ((options & VACOPT_VERBOSE) != 0)
+		if ((options->flags & VACOPT_VERBOSE) != 0)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a8a735c..9838e77 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3846,12 +3846,22 @@ _copyDropdbStmt(const DropdbStmt *from)
 	return newnode;
 }
 
+static VacuumOptions *
+_copyVacuumOptions(const VacuumOptions *from)
+{
+	VacuumOptions *newnode = makeNode(VacuumOptions);
+
+	COPY_SCALAR_FIELD(flags);
+
+	return newnode;
+}
+
 static VacuumStmt *
 _copyVacuumStmt(const VacuumStmt *from)
 {
 	VacuumStmt *newnode = makeNode(VacuumStmt);
 
-	COPY_SCALAR_FIELD(options);
+	COPY_NODE_FIELD(options);
 	COPY_NODE_FIELD(rels);
 
 	return newnode;
@@ -5324,6 +5334,9 @@ copyObjectImpl(const void *from)
 		case T_DropdbStmt:
 			retval = _copyDropdbStmt(from);
 			break;
+		case T_VacuumOptions:
+			retval = _copyVacuumOptions(from);
+			break;
 		case T_VacuumStmt:
 			retval = _copyVacuumStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3cab90e..719eaf5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1668,9 +1668,17 @@ _equalDropdbStmt(const DropdbStmt *a, const DropdbStmt *b)
 }
 
 static bool
+_equalVacuumOptions(const VacuumOptions *a, const VacuumOptions *b)
+{
+	COMPARE_SCALAR_FIELD(flags);
+
+	return true;
+}
+
+static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
-	COMPARE_SCALAR_FIELD(options);
+	COMPARE_NODE_FIELD(options);
 	COMPARE_NODE_FIELD(rels);
 
 	return true;
@@ -3388,6 +3396,9 @@ equal(const void *a, const void *b)
 		case T_DropdbStmt:
 			retval = _equalDropdbStmt(a, b);
 			break;
+		case T_VacuumOptions:
+			retval = _equalVacuumOptions(a, b);
+			break;
 		case T_VacuumStmt:
 			retval = _equalVacuumStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e23e68f..696809b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -188,6 +188,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
+static VacuumOptions *makeVacOpt(VacuumFlag flags);
 
 %}
 
@@ -238,6 +239,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	struct ImportQual	*importqual;
 	InsertStmt			*istmt;
 	VariableSetStmt		*vsetstmt;
+	VacuumOptions		*vacopt;
 	PartitionElem		*partelem;
 	PartitionSpec		*partspec;
 	PartitionBoundSpec	*partboundspec;
@@ -306,8 +308,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				create_extension_opt_item alter_extension_opt_item
 
 %type <ival>	opt_lock lock_type cast_context
-%type <ival>	vacuum_option_list vacuum_option_elem
-				analyze_option_list analyze_option_elem
+%type <vacopt>	vacuum_option_list vacuum_option_elem
+%type <ival>	analyze_option_list analyze_option_elem
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10460,22 +10462,24 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM;
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM);
 					if ($2)
-						n->options |= VACOPT_FULL;
+						opt->flags |= VACOPT_FULL;
 					if ($3)
-						n->options |= VACOPT_FREEZE;
+						opt->flags |= VACOPT_FREEZE;
 					if ($4)
-						n->options |= VACOPT_VERBOSE;
+						opt->flags |= VACOPT_VERBOSE;
 					if ($5)
-						n->options |= VACOPT_ANALYZE;
+						opt->flags |= VACOPT_ANALYZE;
+					n->options = opt;
 					n->rels = $6;
 					$$ = (Node *)n;
 				}
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_VACUUM | $3;
+					n->options = $3;
+					n->options->flags |= VACOPT_VACUUM;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10483,20 +10487,25 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 
 vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
-			| vacuum_option_list ',' vacuum_option_elem		{ $$ = $1 | $3; }
+			| vacuum_option_list ',' vacuum_option_elem
+				{
+					$1->flags |= $3->flags;
+					pfree($3);
+					$$ = $1;
+				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = VACOPT_ANALYZE; }
-			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
-			| FREEZE			{ $$ = VACOPT_FREEZE; }
-			| FULL				{ $$ = VACOPT_FULL; }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL); }
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = VACOPT_SKIP_LOCKED;
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10508,16 +10517,17 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE;
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE);
 					if ($2)
-						n->options |= VACOPT_VERBOSE;
+						opt->flags |= VACOPT_VERBOSE;
+					n->options = opt;
 					n->rels = $3;
 					$$ = (Node *)n;
 				}
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = VACOPT_ANALYZE | $3;
+					n->options = makeVacOpt(VACOPT_ANALYZE | $3);
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16047,6 +16057,18 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 }
 
 /*
+ * Create a VacuumOptions with the given flags.
+ */
+static VacuumOptions *
+makeVacOpt(VacuumFlag flags)
+{
+	VacuumOptions *opt = makeNode(VacuumOptions);
+
+	opt->flags = flags;
+	return opt;
+}
+
+/*
  * Merge the input and output parameters of a table function.
  */
 static List *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3bfac91..83d2678 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -188,7 +188,7 @@ typedef struct av_relation
 typedef struct autovac_table
 {
 	Oid			at_relid;
-	int			at_vacoptions;	/* bitmask of VacuumOption */
+	VacuumOptions	at_vacoptions;
 	VacuumParams at_params;
 	double		at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
@@ -2482,7 +2482,7 @@ do_autovacuum(void)
 			 * next table in our list.
 			 */
 			HOLD_INTERRUPTS();
-			if (tab->at_vacoptions & VACOPT_VACUUM)
+			if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 				errcontext("automatic vacuum of table \"%s.%s.%s\"",
 						   tab->at_datname, tab->at_nspname, tab->at_relname);
 			else
@@ -2883,7 +2883,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
-		tab->at_vacoptions = VACOPT_SKIPTOAST |
+		tab->at_vacoptions.flags = VACOPT_SKIPTOAST |
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
@@ -3110,7 +3110,7 @@ autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 	rel = makeVacuumRelation(rangevar, tab->at_relid, NIL);
 	rel_list = list_make1(rel);
 
-	vacuum(tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
+	vacuum(&tab->at_vacoptions, rel_list, &tab->at_params, bstrategy, true);
 }
 
 /*
@@ -3132,10 +3132,10 @@ autovac_report_activity(autovac_table *tab)
 	int			len;
 
 	/* Report the command and possible options */
-	if (tab->at_vacoptions & VACOPT_VACUUM)
+	if (tab->at_vacoptions.flags & VACOPT_VACUUM)
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: VACUUM%s",
-				 tab->at_vacoptions & VACOPT_ANALYZE ? " ANALYZE" : "");
+				 tab->at_vacoptions.flags & VACOPT_ANALYZE ? " ANALYZE" : "");
 	else
 		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 				 "autovacuum: ANALYZE");
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f..a735ff9 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -664,7 +664,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 				VacuumStmt *stmt = (VacuumStmt *) parsetree;
 
 				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery((stmt->options & VACOPT_VACUUM) ?
+				PreventCommandDuringRecovery((stmt->options->flags & VACOPT_VACUUM) ?
 											 "VACUUM" : "ANALYZE");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				ExecVacuum(stmt, isTopLevel);
@@ -2570,7 +2570,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_VacuumStmt:
-			if (((VacuumStmt *) parsetree)->options & VACOPT_VACUUM)
+			if (((VacuumStmt *) parsetree)->options->flags & VACOPT_VACUUM)
 				tag = "VACUUM";
 			else
 				tag = "ANALYZE";
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 1b6607f..6bd4ab2 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -19,6 +19,7 @@
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "access/table.h"		/* for backward compatibility */
+#include "nodes/parsenodes.h"
 #include "nodes/lockoptions.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
@@ -217,7 +218,7 @@ extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
 struct VacuumParams;
-extern void heap_vacuum_rel(Relation onerel, int options,
+extern void heap_vacuum_rel(Relation onerel, VacuumOptions *options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
 
 /* in heap/heapam_visibility.c */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a051ec..cfc6771 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -163,7 +163,7 @@ extern int	vacuum_multixact_freeze_table_age;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel);
-extern void vacuum(int options, List *relations, VacuumParams *params,
+extern void vacuum(VacuumOptions *options, List *relations, VacuumParams *params,
 	   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
 				 int *nindexes, Relation **Irel);
@@ -192,9 +192,9 @@ extern void vacuum_set_xid_limits(Relation rel,
 extern void vac_update_datfrozenxid(void);
 extern void vacuum_delay_point(void);
 extern bool vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple,
-						 int options);
+									 int flags);
 extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
-					 VacuumParams *params, int options, LOCKMODE lmode);
+					 VacuumParams *params, int flags, LOCKMODE lmode);
 
 /* in commands/analyze.c */
 extern void analyze_rel(Oid relid, RangeVar *relation, int options,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index ffb4cd4..6835e37 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
 	T_PartitionRangeDatum,
 	T_PartitionCmd,
 	T_VacuumRelation,
+	T_VacuumOptions,
 
 	/*
 	 * TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fe35783..7504d9c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3155,7 +3155,7 @@ typedef struct ClusterStmt
  * and VACOPT_ANALYZE must be set in options.
  * ----------------------
  */
-typedef enum VacuumOption
+typedef enum VacuumFlag
 {
 	VACOPT_VACUUM = 1 << 0,		/* do VACUUM */
 	VACOPT_ANALYZE = 1 << 1,	/* do ANALYZE */
@@ -3165,7 +3165,13 @@ typedef enum VacuumOption
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
 	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
-} VacuumOption;
+} VacuumFlag;
+
+typedef struct VacuumOptions
+{
+	NodeTag		type;
+	int			flags; /* OR of VacuumFlag */
+} VacuumOptions;
 
 /*
  * Info about a single target table of VACUUM/ANALYZE.
@@ -3184,9 +3190,9 @@ typedef struct VacuumRelation
 
 typedef struct VacuumStmt
 {
-	NodeTag		type;
-	int			options;		/* OR of VacuumOption flags */
-	List	   *rels;			/* list of VacuumRelation, or NIL for all */
+	NodeTag			type;
+	VacuumOptions	*options;
+	List		   *rels;			/* list of VacuumRelation, or NIL for all */
 } VacuumStmt;
 
 /* ----------------------
-- 
2.10.5

v17-0002-Add-parallel-option-to-VACUUM-command.patchapplication/octet-stream; name=v17-0002-Add-parallel-option-to-VACUUM-command.patchDownload
From fd343f5d59c35bc02c51ab804eab5be2d1fc4a13 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 09:31:41 +0900
Subject: [PATCH v17 2/3] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuuming and index
cleanup in parallel with parallel worker processes if the table
has more than one index. All processes, including the leader
process, work on indexes one by one.

Parallel vacuum can be requested by specifying, for example,
VACUUM (PARALLEL 2) tbl, which performs vacuum with 2 parallel
worker processes.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
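
For example (the table name "tbl" here is only an illustration):

    -- vacuum indexes of tbl with 2 parallel worker processes
    VACUUM (PARALLEL 2) tbl;

    -- let VACUUM choose the degree from the number of indexes
    VACUUM (PARALLEL) tbl;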
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  20 +
 src/backend/access/heap/vacuumlazy.c  | 851 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |   6 +
 src/backend/nodes/copyfuncs.c         |   1 +
 src/backend/nodes/equalfuncs.c        |   1 +
 src/backend/parser/gram.y             |  62 ++-
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   3 +
 src/include/nodes/parsenodes.h        |   4 +-
 src/test/regress/expected/vacuum.out  |   2 +
 src/test/regress/sql/vacuum.sql       |   3 +
 14 files changed, 853 insertions(+), 122 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2..3ca3ae8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..1d7a002 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">N</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>).  Only one worker can be used per index.
+      Workers are launched before each phase starts and exit at the end of
+      the phase.  If the parallel degree
+      <replaceable class="parameter">N</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.  This option
+      cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2c33bf6..4b1edad 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Each index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (in lazy_scan_heap) we prepare the parallel context and
+ * initialize the DSM segment that contains shared information as well as the
+ * memory space for dead tuples. When starting either index vacuuming or index
+ * cleanup, we launch parallel worker processes. Once all indexes are processed,
+ * the parallel worker processes exit and the leader process re-initializes the
+ * DSM segment. Note that the parallel workers live only for the duration of
+ * one index vacuuming or index cleanup pass; the leader process neither exits
+ * parallel mode nor destroys the parallel context between passes. Since no
+ * updates are allowed while in parallel mode, the index statistics are updated
+ * after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,87 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Struct for the bulk-deletion statistics of one index, used for parallel
+ * lazy vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a DSM segment in parallel vacuum mode, or in local
+ * memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Information shared among parallel workers. This is allocated in a
+ * DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells the vacuum workers whether to do index vacuuming or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuuming and to the new live tuples for
+	 * index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	bool			updated_indstats;	/* did we already update index statistics? */
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +225,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -156,11 +248,13 @@ static void lazy_scan_heap(Relation onerel, VacuumOptions *options,
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+							  IndexBulkDeleteResult **stats,
+							  double reltuples,
+							  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count,
+							   bool in_parallel);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,13 +262,26 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static bool lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+									IndexBulkDeleteResult **stats,
+									LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -464,6 +571,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers.
+ *		When allocating the space for the lazy heap scan, we enter parallel mode,
+ *		create the parallel context and initialize a DSM segment for dead tuples.
+ *		dead_tuples points either to a DSM segment in the parallel vacuum case or
+ *		to local memory in the single-process vacuum case.  Before starting
+ *		parallel index vacuuming and parallel index cleanup we launch parallel
+ *		workers.  All parallel workers exit after all indexes have been processed,
+ *		and the leader process re-initializes the parallel context and re-launches
+ *		them at the next execution.  The index statistics are updated by the
+ *		leader after exiting parallel mode, since writes are not allowed while
+ *		in parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -472,6 +591,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;	/* non-NULL means ready for parallel vacuum */
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -494,6 +615,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -529,13 +651,38 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to request and then enable
+	 * parallel lazy vacuum.
+	 */
+	if ((options->flags & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													options->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* Enter the parallel mode and prepare parallel vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+		lps->nworkers_requested = options->nworkers;
+		lps->updated_indstats = false;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
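+	/* Either way, dead_tuples now points at the shared or local tuple store */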
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -713,8 +860,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +889,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+									lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +910,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1106,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1145,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1285,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1354,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1365,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1481,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1515,7 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1531,8 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+								lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1417,8 +1559,12 @@ lazy_scan_heap(Relation onerel, VacuumOptions *options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+							lps, true);
+
+	/* End parallel vacuum, update index statistics if not yet */
+	if (lps)
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1631,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1640,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1542,6 +1688,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1699,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1829,103 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If parallel vacuum is enabled this is
+ * performed with parallel workers, so this function must be used only by
+ * the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+						IndexBulkDeleteResult **stats, LVParallelState *lps,
+						bool for_cleanup)
+{
+	int			nprocessed = 0;
+	bool		do_parallel = false;
+	int			idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	if (nindexes <= 0)
+		return;
+
+	/* Launch parallel vacuum workers if we're ready */
+	if (lps)
+		do_parallel = lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+													   for_cleanup);
+
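+	/*
+	 * Loop until all indexes are processed. In parallel mode, the shared
+	 * atomic counter nprocessed hands out each index to exactly one process
+	 * (leader or worker) without locking; in serial mode we simply walk the
+	 * index array with a local counter.
+	 */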
+	for (;;)
+	{
+		/* Get the next index to vacuum */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (do_parallel &&
+			lps->lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+
+		/*
+		 * Vacuum or clean up one index. For index cleanup, we don't update
+		 * index statistics while in parallel mode.
+		 */
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+							   do_parallel);
+
+		/*
+		 * In parallel lazy vacuum, we copy the index bulk-deletion results
+		 * returned from ambulkdelete and amvacuumcleanup to the DSM segment
+		 * because they allocate the results locally and it's possible that
+		 * the index will be vacuumed by a different vacuum process next time.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (do_parallel &&
+			!lps->lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lps->lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lps->lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+		}
+	}
+
+	/*
+	 * If we failed to launch parallel workers for index cleanup, the index
+	 * statistics have already been updated during execution.
+	 */
+	if (lps != NULL &&
+		!do_parallel &&
+		for_cleanup)
+		lps->updated_indstats = true;
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lps, for_cleanup);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1690,11 +1934,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		vacrelstats->dead_tuples, and update running statistics.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,79 +1947,97 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
 
+	if (IsParallelWorker())
+		msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+	else
+		msg = "scanned index \"%s\" to remove %d row versions";
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ * in_parallel is true if we're performing parallel lazy vacuum. Since
+ * updates are not allowed while in parallel mode, we don't update the
+ * statistics but instead store the index bulk-deletion result in *stats.
+ * Otherwise we update the statistics and set *stats to NULL.
  */
 static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count, bool in_parallel)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	res = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!res)
 		return;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
+	 * is accurate and we're not in parallel mode.
 	 */
-	if (!stats->estimated_count)
+	if (!res->estimated_count && !in_parallel)
 		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
+							res->num_pages,
+							res->num_index_tuples,
 							0,
 							false,
 							InvalidTransactionId,
 							InvalidMultiXactId,
 							false);
 
+	if (IsParallelWorker())
+		msg = "index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker";
+	else
+		msg = "index \"%s\" now contains %.0f row versions in %u pages";
+
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					res->num_index_tuples,
+					res->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   res->tuples_removed,
+					   res->pages_deleted, res->pages_free,
 					   pg_rusage_show(&ru0))));
 
-	pfree(stats);
+	if (!in_parallel)
+	{
+		/*
+		 * Since *stats could point at res, we set it to NULL for safety even
+		 * though it is no longer used.
+		 */
+		pfree(res);
+		*stats = NULL;
+	}
+	else
+		*stats = res;
 }
 
 /*
@@ -2080,19 +2342,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2366,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2422,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2575,393 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Vacuum can be
+ * executed in parallel if the table has more than one index, since parallel
+ * index vacuuming processes each index with one vacuum process. The sizes of
+ * the table and indexes do not affect the parallel degree.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is neither requested nor set in relopts. Compute
+		 * it based on the number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
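+
+	/*
+	 * For example, a table with 3 indexes and no explicit request asks for
+	 * 2 workers, so that together with the leader each index is processed
+	 * by one process.
+	 */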
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter the parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	estshared;
+	Size	estdt;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested, true);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	estshared = MAXALIGN(add_size(SizeOfLVShared,
+								  mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estshared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);
+	estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),
+							  mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, estdt);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
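+
+	/*
+	 * The query text is shared so that workers can report the command they
+	 * are running (e.g. in pg_stat_activity), as other parallel utility
+	 * commands do; the worker entry point presumably reads it back from the
+	 * DSM segment.
+	 */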
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, estshared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, estdt);
+	tidmap->max_tuples = (int) maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->updated_indstats = false;
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * If the index statistics have not been updated yet, we copy the statistics
+ * of all indexes before destroying the parallel context, and then update
+ * them after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+
+	Assert(!IsParallelWorker());
+
+	if (!lps->updated_indstats)
+	{
+		Assert(Irel != NULL && nindexes > 0);
+		/* copy the index statistics to a temporary space */
+		copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+		memcpy(copied_indstats, lps->lvshared->indstats,
+			   sizeof(LVIndStats) * nindexes);
+	}
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
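+	/* Now that we are out of parallel mode, updates to pg_class are allowed */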
+	if (!lps->updated_indstats)
+	{
+		int i;
+
+		for (i = 0; i < nindexes; i++)
+		{
+			LVIndStats *s = &(copied_indstats[i]);
+
+			/* Update index statistics */
+			if (s->updated && !s->stats.estimated_count)
+				vac_update_relstats(Irel[i],
+									s->stats.num_pages,
+									s->stats.num_index_tuples,
+									0,
+									false,
+									InvalidTransactionId,
+									InvalidMultiXactId,
+									false);
+		}
+
+		pfree(copied_indstats);
+
+		/* For safety */
+		lps->updated_indstats = true;
+	}
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Request workers to do either vacuuming indexes or cleaning indexes */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requsted %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		lazy_end_parallel_vacuum_index(lps, for_cleanup);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+	/*
+	 * Unless this was the cleanup phase (which runs only once), reinitialize
+	 * the DSM space so that we can relaunch parallel workers for the next
+	 * execution.
+	 */
+	if (!for_cleanup)
+		ReinitializeParallelDSM(lps->pcxt);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Open relations */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/* indrels are sorted in order by OID */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up all indexes. This is similar to lazy_vacuum_all_indexes,
+ * but this function is used by the parallel vacuum worker processes.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to vacuum */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/*  Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If a vacuum process has already updated the bulk-deletion result,
+		 * we pass it to the index AM. Otherwise we pass NULL, as index AMs
+		 * expect NULL for the first execution.
+		 */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuuming or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(indrels[idx], &stats, lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], &stats, lvshared->reltuples,
+							   lvshared->estimated_count, true);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because they allocate the
+		 * results locally and it's possible that an index will be vacuumed
+		 * by a different vacuum process the next time.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats)
+		{
+			memcpy(&(lvshared->indstats[idx].stats), stats,
+				   sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bdd7d46..3b9d891 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -112,6 +112,12 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((vacstmt->options->flags & VACOPT_FULL) &&
+		(vacstmt->options->flags & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 9838e77..bff5ecc 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3852,6 +3852,7 @@ _copyVacuumOptions(const VacuumOptions *from)
 	VacuumOptions *newnode = makeNode(VacuumOptions);
 
 	COPY_SCALAR_FIELD(flags);
+	COPY_SCALAR_FIELD(nworkers);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 719eaf5..b3c7934 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1671,6 +1671,7 @@ static bool
 _equalVacuumOptions(const VacuumOptions *a, const VacuumOptions *b)
 {
 	COMPARE_SCALAR_FIELD(flags);
+	COMPARE_SCALAR_FIELD(nworkers);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 696809b..982cec8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -188,7 +188,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
-static VacuumOptions *makeVacOpt(VacuumFlag flags);
+static VacuumOptions *makeVacOpt(VacuumFlag flag, int nworkers);
 
 %}
 
@@ -10462,7 +10462,7 @@ cluster_index_specification:
 VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM);
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM, 0);
 					if ($2)
 						opt->flags |= VACOPT_FULL;
 					if ($3)
@@ -10478,8 +10478,10 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 			| VACUUM '(' vacuum_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options = $3;
-					n->options->flags |= VACOPT_VACUUM;
+					VacuumOptions *opt = makeVacOpt(VACOPT_VACUUM, 0);
+					opt->flags = VACOPT_VACUUM | $3->flags;
+					opt->nworkers = $3->nworkers;
+					n->options = opt;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -10489,23 +10491,38 @@ vacuum_option_list:
 			vacuum_option_elem								{ $$ = $1; }
 			| vacuum_option_list ',' vacuum_option_elem
 				{
-					$1->flags |= $3->flags;
-					pfree($3);
-					$$ = $1;
+					VacuumOptions *opt1 = $1;
+					VacuumOptions *opt2 = $3;
+
+					opt1->flags |= opt2->flags;
+					if (opt2->flags == VACOPT_PARALLEL)
+						opt1->nworkers = opt2->nworkers;
+					pfree(opt2);
+					$$ = opt1;
 				}
 		;
 
 vacuum_option_elem:
-			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE); }
-			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE); }
-			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE); }
-			| FULL				{ $$ = makeVacOpt(VACOPT_FULL); }
+			analyze_keyword		{ $$ = makeVacOpt(VACOPT_ANALYZE, 0); }
+			| VERBOSE			{ $$ = makeVacOpt(VACOPT_VERBOSE, 0); }
+			| FREEZE			{ $$ = makeVacOpt(VACOPT_FREEZE, 0); }
+			| FULL				{ $$ = makeVacOpt(VACOPT_FULL, 0); }
+			| PARALLEL			{ $$ = makeVacOpt(VACOPT_PARALLEL, 0); }
+			| PARALLEL ICONST
+			{
+				if ($2 < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(@1)));
+				$$ = makeVacOpt(VACOPT_PARALLEL, $2);
+			}
 			| IDENT
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
-						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING);
+						$$ = makeVacOpt(VACOPT_DISABLE_PAGE_SKIPPING, 0);
 					else if (strcmp($1, "skip_locked") == 0)
-						$$ = makeVacOpt(VACOPT_SKIP_LOCKED);
+						$$ = makeVacOpt(VACOPT_SKIP_LOCKED, 0);
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
@@ -10517,7 +10534,8 @@ vacuum_option_elem:
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE);
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE, 0);
+
 					if ($2)
 						opt->flags |= VACOPT_VERBOSE;
 					n->options = opt;
@@ -10527,7 +10545,9 @@ AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 			| analyze_keyword '(' analyze_option_list ')' opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
-					n->options =  makeVacOpt(VACOPT_ANALYZE | $3);
+					VacuumOptions *opt = makeVacOpt(VACOPT_ANALYZE, 0);
+					opt->flags = VACOPT_ANALYZE | $3;
+					n->options = opt;
 					n->rels = $5;
 					$$ = (Node *) n;
 				}
@@ -16056,16 +16076,18 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
 /*
- * Create a VacuumOptions with the given flags.
+ * Create a VacuumOptions with the given options.
  */
 static VacuumOptions *
-makeVacOpt(const VacuumFlag flags)
+makeVacOpt(VacuumFlag flag, int nworkers)
 {
-	VacuumOptions *opt = makeNode(VacuumOptions);
+	VacuumOptions *vacopt = makeNode(VacuumOptions);
 
-	opt->flags = flags;
-	return opt;
+	vacopt->flags = flag;
+	vacopt->nworkers = nworkers;
+	return vacopt;
 }
 
 /*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 83d2678..447d4a2 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2887,6 +2887,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_vacoptions.nworkers = 0;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21c..fef80c4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3429,7 +3429,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 6bd4ab2..7b7f5ab 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -21,6 +22,7 @@
 #include "access/table.h"		/* for backward compatibility */
 #include "nodes/parsenodes.h"
 #include "nodes/lockoptions.h"
+#include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
@@ -220,6 +222,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel, VacuumOptions *options,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7504d9c..aada74b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3164,13 +3164,15 @@ typedef enum VacuumFlag
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8	/* do lazy vacuum in parallel */
 } VacuumFlag;
 
 typedef struct VacuumOptions
 {
 	NodeTag		type;
 	int			flags; /* OR of VacuumFlag */
+	int			nworkers;	/* # of parallel vacuum workers */
 } VacuumOptions;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..9b5b7dc 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..f92c4e5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

Attachment: v17-0003-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From a5c36f9c943d4e992e52922b38d10934d8a0ac4b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v17 3/3] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..b9799db 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with the \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming, optionally with NUM workers\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

#43Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#42)

Hello.

At Mon, 18 Mar 2019 11:54:42 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoC6bsM0FfePgzSV40uXofbFSPe-Ax095TOnu5GOZ790uA@mail.gmail.com>

Here are the performance test results. I've set up a 500MB table
with several indexes and made 10% of the table dirty before each
vacuum. I compared the execution time of the patched postgres with
the current HEAD (the 'speed_up' column is the HEAD time divided
by the patched time). In my environment,

 indexes | parallel_degree |  patched   |    head    | speed_up
---------+-----------------+------------+------------+----------
       0 |               0 |   238.2085 |   244.7625 |   1.0275
       0 |               1 |   237.7050 |   244.7625 |   1.0297
       0 |               2 |   238.0390 |   244.7625 |   1.0282
       0 |               4 |   238.1045 |   244.7625 |   1.0280
       0 |               8 |   237.8995 |   244.7625 |   1.0288
       0 |              16 |   237.7775 |   244.7625 |   1.0294
       1 |               0 |  1328.8590 |  1334.9125 |   1.0046
       1 |               1 |  1325.9140 |  1334.9125 |   1.0068
       1 |               2 |  1333.3665 |  1334.9125 |   1.0012
       1 |               4 |  1329.5205 |  1334.9125 |   1.0041
       1 |               8 |  1334.2255 |  1334.9125 |   1.0005
       1 |              16 |  1335.1510 |  1334.9125 |   0.9998
       2 |               0 |  2426.2905 |  2427.5165 |   1.0005
       2 |               1 |  1416.0595 |  2427.5165 |   1.7143
       2 |               2 |  1411.6270 |  2427.5165 |   1.7197
       2 |               4 |  1411.6490 |  2427.5165 |   1.7196
       2 |               8 |  1410.1750 |  2427.5165 |   1.7214
       2 |              16 |  1413.4985 |  2427.5165 |   1.7174
       4 |               0 |  4622.5060 |  4619.0340 |   0.9992
       4 |               1 |  2536.8435 |  4619.0340 |   1.8208
       4 |               2 |  2548.3615 |  4619.0340 |   1.8126
       4 |               4 |  1467.9655 |  4619.0340 |   3.1466
       4 |               8 |  1486.3155 |  4619.0340 |   3.1077
       4 |              16 |  1481.7150 |  4619.0340 |   3.1174
       8 |               0 |  9039.3810 |  8990.4735 |   0.9946
       8 |               1 |  4807.5880 |  8990.4735 |   1.8701
       8 |               2 |  3786.7620 |  8990.4735 |   2.3742
       8 |               4 |  2924.2205 |  8990.4735 |   3.0745
       8 |               8 |  2684.2545 |  8990.4735 |   3.3493
       8 |              16 |  2672.9800 |  8990.4735 |   3.3635
      16 |               0 | 17821.4715 | 17740.1300 |   0.9954
      16 |               1 |  9318.3810 | 17740.1300 |   1.9038
      16 |               2 |  7260.6315 | 17740.1300 |   2.4433
      16 |               4 |  5538.5225 | 17740.1300 |   3.2030
      16 |               8 |  5368.5255 | 17740.1300 |   3.3045
      16 |              16 |  5291.8510 | 17740.1300 |   3.3523
(36 rows)

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe
you had four cores, so parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
chosen so that the I/O capacity is filled up, but this feature
intermittently saturates the CPU capacity instead under such a
situation.

I'm not sure, but what if we did index vacuuming in a
one-tuple-by-one manner? That is, heap vacuum passes dead tuples
one by one (or buffers a few tuples) to workers, and workers
process them not by bulkdelete but by a per-tuple delete (which
we don't have). That could avoid the heap scan sitting idle while
index bulkdelete runs.

Attached the updated version patches. The patches apply to the current
HEAD cleanly but the 0001 patch still changes the vacuum option to a
Node since it's under the discussion. After the direction has been
decided, I'll update the patches.

As for the to-be-or-not-to-be-a-node problem, I don't think it is
needed, but from the point of consistency it seems reasonable, and
it is seen in other nodes that a *Stmt node holds an options node.
But makeVacOpt, its usage, and subsequent operations on the node
look somewhat strange. Why don't you just do
"makeNode(VacuumOptions)"?

+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+    maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);

If I understand this correctly, nindexes is always > 1 there. At
least assert that it is > 0 there.

+ estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),

I don't think the name is good. ('dt' read as 'detach' to me at first look.)

+        if (lps->nworkers_requested > 0)
+            appendStringInfo(&buf,
+                             ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",

"planned"?

+        /* Get the next index to vacuum */
+        if (do_parallel)
+            idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+        else
+            idx = nprocessed++;

It seems that both of the two cases can be handled using
LVParallelState, and most of the branches on lps or do_parallel
can be removed.
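
To be concrete, the following minimal standalone sketch is what I
have in mind (C11 atomics stand in for pg_atomic_*, and all names
here are hypothetical):

#include <stdatomic.h>

/* Hypothetical stand-in for LVParallelState: the claim counter
 * always lives here, whether the struct sits in DSM (parallel
 * case) or in backend-private memory (serial case). */
typedef struct SketchParallelState
{
	atomic_uint	nprocessed;		/* next index to claim */
} SketchParallelState;

/* One claim loop then serves both the serial and parallel cases. */
static void
sketch_vacuum_indexes(SketchParallelState *lps, unsigned int nindexes)
{
	for (;;)
	{
		unsigned int	idx = atomic_fetch_add(&lps->nprocessed, 1);

		if (idx >= nindexes)
			break;
		/* ... vacuum or clean up index number idx here ... */
	}
}

In the serial case the atomic add is wasted work, but the branches
disappear.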

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#44Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#41)

On Thu, Mar 14, 2019 at 3:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW your patch does not seem to apply to the current HEAD cleanly,
and the comment of vacuum() needs to be updated.

Yeah, I omitted some hunks by being stupid with 'git'.

Since you seem to like the approach, I put back the hunks I intended
to have there, pulled in one change from your v2 that looked good,
made one other tweak, and committed this. I think I like what I did
with vacuum_open_relation a bit better than what you did; actually, I
think it cannot be right to just pass 'params' when the current code
is passing params->options & ~(VACOPT_VACUUM). My approach avoids
that particular pitfall.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#45Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#41)

On Thu, Mar 14, 2019 at 3:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patch you proposed and the patch that converts
the grammer productions for the VACUUM option on top of the former
patch. The latter patch moves VacuumOption to vacuum.h since the
parser no longer needs such information.

Committed.

If we take this direction I will change the parallel vacuum patch so
that it adds a new PARALLEL option and adds 'nworkers' to VacuumParams.

Sounds good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#46Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#44)

On Tue, Mar 19, 2019 at 3:05 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 14, 2019 at 3:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW your patch does not seem to apply to the current HEAD cleanly,
and the comment of vacuum() needs to be updated.

Yeah, I omitted some hunks by being stupid with 'git'.

Since you seem to like the approach, I put back the hunks I intended
to have there, pulled in one change from your v2 that looked good,
made one other tweak, and committed this.

Thank you!

I think I like what I did
with vacuum_open_relation a bit better than what you did; actually, I
think it cannot be right to just pass 'params' when the current code
is passing params->options & ~(VACOPT_VACUUM). My approach avoids
that particular pitfall.

Agreed. Thanks.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#47Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#42)

On Mon, Mar 18, 2019 at 1:58 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi <kommi.haribabu@gmail.com>

wrote:

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com>

wrote:

Thank you. Attached the rebased patch.

I ran some performance tests to compare the parallelism benefits,

Thank you for testing!

but I got some strange results showing performance overhead; maybe
it is because I tested it on my laptop.

Hmm, I think the parallel vacuum would help for heavy workloads like a
big table with multiple indexes. In your test results, all executions
complete within 1 sec, which seems to be a use case that parallel
vacuum wouldn't help with. I suspect that the table is small,
right? Anyway I'll also do performance tests.

Here are the performance test results. I've set up a 500MB table
with several indexes and made 10% of the table dirty before each
vacuum. I compared the execution time of the patched postgres with
the current HEAD (the 'speed_up' column is the HEAD time divided
by the patched time). In my environment,

 indexes | parallel_degree |  patched   |    head    | speed_up
---------+-----------------+------------+------------+----------
       0 |               0 |   238.2085 |   244.7625 |   1.0275
       0 |               1 |   237.7050 |   244.7625 |   1.0297
       0 |               2 |   238.0390 |   244.7625 |   1.0282
       0 |               4 |   238.1045 |   244.7625 |   1.0280
       0 |               8 |   237.8995 |   244.7625 |   1.0288
       0 |              16 |   237.7775 |   244.7625 |   1.0294
       1 |               0 |  1328.8590 |  1334.9125 |   1.0046
       1 |               1 |  1325.9140 |  1334.9125 |   1.0068
       1 |               2 |  1333.3665 |  1334.9125 |   1.0012
       1 |               4 |  1329.5205 |  1334.9125 |   1.0041
       1 |               8 |  1334.2255 |  1334.9125 |   1.0005
       1 |              16 |  1335.1510 |  1334.9125 |   0.9998
       2 |               0 |  2426.2905 |  2427.5165 |   1.0005
       2 |               1 |  1416.0595 |  2427.5165 |   1.7143
       2 |               2 |  1411.6270 |  2427.5165 |   1.7197
       2 |               4 |  1411.6490 |  2427.5165 |   1.7196
       2 |               8 |  1410.1750 |  2427.5165 |   1.7214
       2 |              16 |  1413.4985 |  2427.5165 |   1.7174
       4 |               0 |  4622.5060 |  4619.0340 |   0.9992
       4 |               1 |  2536.8435 |  4619.0340 |   1.8208
       4 |               2 |  2548.3615 |  4619.0340 |   1.8126
       4 |               4 |  1467.9655 |  4619.0340 |   3.1466
       4 |               8 |  1486.3155 |  4619.0340 |   3.1077
       4 |              16 |  1481.7150 |  4619.0340 |   3.1174
       8 |               0 |  9039.3810 |  8990.4735 |   0.9946
       8 |               1 |  4807.5880 |  8990.4735 |   1.8701
       8 |               2 |  3786.7620 |  8990.4735 |   2.3742
       8 |               4 |  2924.2205 |  8990.4735 |   3.0745
       8 |               8 |  2684.2545 |  8990.4735 |   3.3493
       8 |              16 |  2672.9800 |  8990.4735 |   3.3635
      16 |               0 | 17821.4715 | 17740.1300 |   0.9954
      16 |               1 |  9318.3810 | 17740.1300 |   1.9038
      16 |               2 |  7260.6315 | 17740.1300 |   2.4433
      16 |               4 |  5538.5225 | 17740.1300 |   3.2030
      16 |               8 |  5368.5255 | 17740.1300 |   3.3045
      16 |              16 |  5291.8510 | 17740.1300 |   3.3523
(36 rows)

The performance results are good. Do we want to add a recommended
size in the document for the parallel option? The parallel option
for smaller tables can lead to performance overhead.

Regards,
Haribabu Kommi
Fujitsu Australia

#48Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#43)

On Mon, Mar 18, 2019 at 7:06 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

At Mon, 18 Mar 2019 11:54:42 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoC6bsM0FfePgzSV40uXofbFSPe-Ax095TOnu5GOZ790uA@mail.gmail.com>

Here are the performance test results. I've set up a 500MB table
with several indexes and made 10% of the table dirty before each
vacuum. I compared the execution time of the patched postgres with
the current HEAD (the 'speed_up' column is the HEAD time divided
by the patched time). In my environment,

 indexes | parallel_degree |  patched   |    head    | speed_up
---------+-----------------+------------+------------+----------
       0 |               0 |   238.2085 |   244.7625 |   1.0275
       0 |               1 |   237.7050 |   244.7625 |   1.0297
       0 |               2 |   238.0390 |   244.7625 |   1.0282
       0 |               4 |   238.1045 |   244.7625 |   1.0280
       0 |               8 |   237.8995 |   244.7625 |   1.0288
       0 |              16 |   237.7775 |   244.7625 |   1.0294
       1 |               0 |  1328.8590 |  1334.9125 |   1.0046
       1 |               1 |  1325.9140 |  1334.9125 |   1.0068
       1 |               2 |  1333.3665 |  1334.9125 |   1.0012
       1 |               4 |  1329.5205 |  1334.9125 |   1.0041
       1 |               8 |  1334.2255 |  1334.9125 |   1.0005
       1 |              16 |  1335.1510 |  1334.9125 |   0.9998
       2 |               0 |  2426.2905 |  2427.5165 |   1.0005
       2 |               1 |  1416.0595 |  2427.5165 |   1.7143
       2 |               2 |  1411.6270 |  2427.5165 |   1.7197
       2 |               4 |  1411.6490 |  2427.5165 |   1.7196
       2 |               8 |  1410.1750 |  2427.5165 |   1.7214
       2 |              16 |  1413.4985 |  2427.5165 |   1.7174
       4 |               0 |  4622.5060 |  4619.0340 |   0.9992
       4 |               1 |  2536.8435 |  4619.0340 |   1.8208
       4 |               2 |  2548.3615 |  4619.0340 |   1.8126
       4 |               4 |  1467.9655 |  4619.0340 |   3.1466
       4 |               8 |  1486.3155 |  4619.0340 |   3.1077
       4 |              16 |  1481.7150 |  4619.0340 |   3.1174
       8 |               0 |  9039.3810 |  8990.4735 |   0.9946
       8 |               1 |  4807.5880 |  8990.4735 |   1.8701
       8 |               2 |  3786.7620 |  8990.4735 |   2.3742
       8 |               4 |  2924.2205 |  8990.4735 |   3.0745
       8 |               8 |  2684.2545 |  8990.4735 |   3.3493
       8 |              16 |  2672.9800 |  8990.4735 |   3.3635
      16 |               0 | 17821.4715 | 17740.1300 |   0.9954
      16 |               1 |  9318.3810 | 17740.1300 |   1.9038
      16 |               2 |  7260.6315 | 17740.1300 |   2.4433
      16 |               4 |  5538.5225 | 17740.1300 |   3.2030
      16 |               8 |  5368.5255 | 17740.1300 |   3.3045
      16 |              16 |  5291.8510 | 17740.1300 |   3.3523
(36 rows)

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe
you had four cores, so parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
chosen so that the I/O capacity is filled up, but this feature
intermittently saturates the CPU capacity instead under such a
situation.

I'm sorry I didn't make it clear enough. If the parallel degree is
higher than 'the number of indexes - 1' redundant workers are not
launched. So for indexes=4, 8, 16 the number of actually launched
parallel workers is up to 3, 7, 15 respectively. That's why the result
shows almost the same execution time in the cases where nindexes <=
parallel_degree.

I'll share the performance test results of larger tables and indexes.

I'm not sure, but what if we did index vacuuming in a
one-tuple-by-one manner? That is, heap vacuum passes dead tuples
one by one (or buffers a few tuples) to workers, and workers
process them not by bulkdelete but by a per-tuple delete (which
we don't have). That could avoid the heap scan sitting idle while
index bulkdelete runs.

Just to be clear, in parallel lazy vacuum all parallel vacuum
processes, including the leader process, do index vacuuming; no
one sleeps during index vacuuming. The leader process does the
heap scan and launches parallel workers before index vacuuming.
Each process exclusively processes indexes one by one.

Such an index deletion method could be an optimization, but I'm
not sure that calling tuple_delete many times would be faster
than one bulkdelete. If there are many dead tuples, vacuum has to
call tuple_delete once per dead tuple. In general one seqscan is
faster than tons of indexscans. There is a proposal for such
one-by-one index deletions[1], but it's not a replacement for
bulkdelete.
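
To put rough numbers on it: with 1 million dead tuples, the
one-by-one approach would mean on the order of 1 million separate
index descents per index, while bulkdelete is a single sequential
pass over the index regardless of the number of dead tuples. So
the one-by-one approach seems likely to win only when the deleted
fraction is very small.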

Attached the updated version patches. The patches apply to the current
HEAD cleanly but the 0001 patch still changes the vacuum option to a
Node since it's under the discussion. After the direction has been
decided, I'll update the patches.

As for the to-be-or-not-to-be-a-node problem, I don't think it is
needed, but from the point of consistency it seems reasonable, and
it is seen in other nodes that a *Stmt node holds an options node.
But makeVacOpt, its usage, and subsequent operations on the node
look somewhat strange. Why don't you just do
"makeNode(VacuumOptions)"?

Thank you for the comment but this part has gone away as the recent
commit changed the grammar production of vacuum command.

+      /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+    maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);

If I understand this correctly, nindexes is always > 1 there. At
least assert that it is > 0 there.

+ estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),

I don't think the name is good. ('dt' read as 'detach' to me at first look.)

Fixed.

+        if (lps->nworkers_requested > 0)
+            appendStringInfo(&buf,
+                             ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",

"planned"?

The 'planned' shows how many parallel workers we planned to launch.
The degree of parallelism is determined based on either user request
or the number of indexes that the table has.
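
For example (the numbers here are made up), if a user requests
PARALLEL 8 on a table with 5 indexes but only 3 worker slots are
free, the message would read: launched 3 parallel vacuum workers
for index vacuuming (planned: 4, requested 8).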

+        /* Get the next index to vacuum */
+        if (do_parallel)
+            idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+        else
+            idx = nprocessed++;

It seems that both of the two cases can be handled using
LVParallelState, and most of the branches on lps or do_parallel
can be removed.

Sorry I couldn't get your comment. You meant to move nprocessed to
LVParallelState?

[1]: /messages/by-id/425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#49Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#48)

At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe
you had four cores, so parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
chosen so that the I/O capacity is filled up, but this feature
intermittently saturates the CPU capacity instead under such a
situation.

I'm sorry I didn't make it clear enough. If the parallel degree is
higher than 'the number of indexes - 1' redundant workers are not
launched. So for indexes=4, 8, 16 the number of actually launched
parallel workers is up to 3, 7, 15 respectively. That's why the result
shows almost the same execution time in the cases where nindexes <=
parallel_degree.

In the 16 indexes case, the performance saturated at 4 workers,
which contradicts your explanation.

I'll share the performance test results of larger tables and indexes.

I'm not sure, but what if we did index vacuuming in a
one-tuple-by-one manner? That is, heap vacuum passes dead tuples
one by one (or buffers a few tuples) to workers, and workers
process them not by bulkdelete but by a per-tuple delete (which
we don't have). That could avoid the heap scan sitting idle while
index bulkdelete runs.

Just to be clear, in parallel lazy vacuum all parallel vacuum
processes, including the leader process, do index vacuuming; no
one sleeps during index vacuuming. The leader process does the
heap scan and launches parallel workers before index vacuuming.
Each process exclusively processes indexes one by one.

The leader doesn't continue the heap scan while index vacuuming
is running. And the index page scan seems to eat up CPU easily.
If index vacuuming could run simultaneously with the next heap
scan phase, we could make the index scan finish at almost the
same time as the next round of heap scan. That would reduce the
(possible) CPU contention. But this requires roughly twice as
much shared memory as the current implementation.
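
Roughly like the following standalone sketch (all names here are
hypothetical, and the synchronization is elided):

/* Two dead-tuple buffers: the leader fills one while the workers
 * run bulkdelete against the other, then the roles are swapped.
 * This is why roughly twice the shared memory would be needed. */
typedef struct SketchDeadTupleBuf
{
	int		num_tuples;		/* dead tuples collected so far */
	/* the dead tuple TIDs would follow here */
} SketchDeadTupleBuf;

typedef struct SketchDoubleBuf
{
	SketchDeadTupleBuf *fill;	/* leader appends dead tuples here */
	SketchDeadTupleBuf *drain;	/* workers bulkdelete from here */
} SketchDoubleBuf;

/* Called after the workers have drained one buffer and the leader
 * has filled the other. */
static void
sketch_swap_buffers(SketchDoubleBuf *db)
{
	SketchDeadTupleBuf *tmp = db->fill;

	db->fill = db->drain;
	db->drain = tmp;
	db->fill->num_tuples = 0;	/* start refilling from scratch */
}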

Such an index deletion method could be an optimization, but I'm
not sure that calling tuple_delete many times would be faster
than one bulkdelete. If there are many dead tuples, vacuum has to
call tuple_delete once per dead tuple. In general one seqscan is
faster than tons of indexscans. There is a proposal for such
one-by-one index deletions[1], but it's not a replacement for
bulkdelete.

I'm not sure what you mean by 'replacement', but it depends on
how large a part of the table is removed at once, as mentioned in
the thread. But unfortunately it doesn't seem easy to do.

Attached the updated version patches. The patches apply to the current
HEAD cleanly but the 0001 patch still changes the vacuum option to a
Node since it's under the discussion. After the direction has been
decided, I'll update the patches.

As for the to-be-or-not-to-be-a-node problem, I don't think it is
needed, but from the point of consistency it seems reasonable, and
it is seen in other nodes that a *Stmt node holds an options node.
But makeVacOpt, its usage, and subsequent operations on the node
look somewhat strange. Why don't you just do
"makeNode(VacuumOptions)"?

Thank you for the comment but this part has gone away as the recent
commit changed the grammar production of vacuum command.

Oops!

+      /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+    maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);

If I understand this correctly, nindexes is always > 1 there. At
least assert that it is > 0 there.

+ estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),

I don't think the name is good. ('dt' read as 'detach' to me at first look.)

Fixed.

+        if (lps->nworkers_requested > 0)
+            appendStringInfo(&buf,
+                             ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",

"planned"?

The 'planned' shows how many parallel workers we planned to launch.
The degree of parallelism is determined based on either user request
or the number of indexes that the table has.

+        /* Get the next index to vacuum */
+        if (do_parallel)
+            idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+        else
+            idx = nprocessed++;

It seems that both of the two cases can be handled using
LVParallelState, and most of the branches on lps or do_parallel
can be removed.

Sorry I couldn't get your comment. You meant to move nprocessed to
LVParallelState?

Exactly. I meant letting lvshared point to private memory, but
it might introduce confusion.

[1] /messages/by-id/425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#50Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#47)

On Tue, Mar 19, 2019 at 10:39 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

On Mon, Mar 18, 2019 at 1:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you. Attached the rebased patch.

I ran some performance tests to compare the parallelism benefits,

Thank you for testing!

but I got some strange results showing performance overhead; maybe
it is because I tested it on my laptop.

Hmm, I think the parallel vacuum would help for heavy workloads like a
big table with multiple indexes. In your test results, all executions
complete within 1 sec, which seems to be a use case that parallel
vacuum wouldn't help with. I suspect that the table is small,
right? Anyway I'll also do performance tests.

Here are the performance test results. I've set up a 500MB table
with several indexes and made 10% of the table dirty before each
vacuum. I compared the execution time of the patched postgres with
the current HEAD (the 'speed_up' column is the HEAD time divided
by the patched time). In my environment,

 indexes | parallel_degree |  patched   |    head    | speed_up
---------+-----------------+------------+------------+----------
       0 |               0 |   238.2085 |   244.7625 |   1.0275
       0 |               1 |   237.7050 |   244.7625 |   1.0297
       0 |               2 |   238.0390 |   244.7625 |   1.0282
       0 |               4 |   238.1045 |   244.7625 |   1.0280
       0 |               8 |   237.8995 |   244.7625 |   1.0288
       0 |              16 |   237.7775 |   244.7625 |   1.0294
       1 |               0 |  1328.8590 |  1334.9125 |   1.0046
       1 |               1 |  1325.9140 |  1334.9125 |   1.0068
       1 |               2 |  1333.3665 |  1334.9125 |   1.0012
       1 |               4 |  1329.5205 |  1334.9125 |   1.0041
       1 |               8 |  1334.2255 |  1334.9125 |   1.0005
       1 |              16 |  1335.1510 |  1334.9125 |   0.9998
       2 |               0 |  2426.2905 |  2427.5165 |   1.0005
       2 |               1 |  1416.0595 |  2427.5165 |   1.7143
       2 |               2 |  1411.6270 |  2427.5165 |   1.7197
       2 |               4 |  1411.6490 |  2427.5165 |   1.7196
       2 |               8 |  1410.1750 |  2427.5165 |   1.7214
       2 |              16 |  1413.4985 |  2427.5165 |   1.7174
       4 |               0 |  4622.5060 |  4619.0340 |   0.9992
       4 |               1 |  2536.8435 |  4619.0340 |   1.8208
       4 |               2 |  2548.3615 |  4619.0340 |   1.8126
       4 |               4 |  1467.9655 |  4619.0340 |   3.1466
       4 |               8 |  1486.3155 |  4619.0340 |   3.1077
       4 |              16 |  1481.7150 |  4619.0340 |   3.1174
       8 |               0 |  9039.3810 |  8990.4735 |   0.9946
       8 |               1 |  4807.5880 |  8990.4735 |   1.8701
       8 |               2 |  3786.7620 |  8990.4735 |   2.3742
       8 |               4 |  2924.2205 |  8990.4735 |   3.0745
       8 |               8 |  2684.2545 |  8990.4735 |   3.3493
       8 |              16 |  2672.9800 |  8990.4735 |   3.3635
      16 |               0 | 17821.4715 | 17740.1300 |   0.9954
      16 |               1 |  9318.3810 | 17740.1300 |   1.9038
      16 |               2 |  7260.6315 | 17740.1300 |   2.4433
      16 |               4 |  5538.5225 | 17740.1300 |   3.2030
      16 |               8 |  5368.5255 | 17740.1300 |   3.3045
      16 |              16 |  5291.8510 | 17740.1300 |   3.3523
(36 rows)

The performance results are good. Do we want to add a recommended
size in the document for the parallel option? The parallel option
for smaller tables can lead to performance overhead.

Hmm, I don't think we can add a specific recommended size because
the performance gain from parallel lazy vacuum depends on various
things such as CPU cores, the number of indexes, shared buffer
size, index types, and HDD or SSD. I suppose that users who want
to use this option have some sort of performance problem, such as
vacuum taking a very long time. They would use it for relatively
large tables.
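
For example, someone whose vacuum of a large table with many
indexes takes too long could try, with this patch, something like
"VACUUM (PARALLEL 4, VERBOSE) big_table;" (the table name here is
just a placeholder).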

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#51Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#49)

On Tue, Mar 19, 2019 at 4:59 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe
you had four cores, so parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
chosen so that the I/O capacity is filled up, but this feature
intermittently saturates the CPU capacity instead under such a
situation.

I'm sorry I didn't make it clear enough. If the parallel degree is
higher than 'the number of indexes - 1' redundant workers are not
launched. So for indexes=4, 8, 16 the number of actually launched
parallel workers is up to 3, 7, 15 respectively. That's why the result
shows almost the same execution time in the cases where nindexes <=
parallel_degree.

In the 16 indexes case, the performance saturated at 4 workers,
which contradicts your explanation.

Because the machine I used has 4 cores, the performance doesn't
improve even if more than 4 parallel workers are launched.

I'll share the performance test results of larger tables and indexes.

I'm not sure, but what if we did index vacuuming in a
one-tuple-by-one manner? That is, heap vacuum passes dead tuples
one by one (or buffers a few tuples) to workers, and workers
process them not by bulkdelete but by a per-tuple delete (which
we don't have). That could avoid the heap scan sitting idle while
index bulkdelete runs.

Just to be clear, in parallel lazy vacuum all parallel vacuum
processes, including the leader process, do index vacuuming; no
one sleeps during index vacuuming. The leader process does the
heap scan and launches parallel workers before index vacuuming.
Each process exclusively processes indexes one by one.

The leader doesn't continue the heap scan while index vacuuming
is running. And the index page scan seems to eat up CPU easily.
If index vacuuming could run simultaneously with the next heap
scan phase, we could make the index scan finish at almost the
same time as the next round of heap scan. That would reduce the
(possible) CPU contention. But this requires roughly twice as
much shared memory as the current implementation.

Yeah, I've considered something like a pipelining approach, where
one process continues to queue the dead tuples and another process
fetches and processes them during index vacuuming, but the current
version of the patch employs the simplest approach as the first
step. Once we have the retail index deletion approach, we might be
able to use it for parallel vacuum.

Such an index deletion method could be an optimization, but I'm
not sure that calling tuple_delete many times would be faster
than one bulkdelete. If there are many dead tuples, vacuum has to
call tuple_delete once per dead tuple. In general one seqscan is
faster than tons of indexscans. There is a proposal for such
one-by-one index deletions[1], but it's not a replacement for
bulkdelete.

I'm not sure what you mean by 'replacement', but it depends on
how large a part of the table is removed at once, as mentioned in
the thread. But unfortunately it doesn't seem easy to do.

Attached the updated version patches. The patches apply to the current
HEAD cleanly but the 0001 patch still changes the vacuum option to a
Node since it's under the discussion. After the direction has been
decided, I'll update the patches.

As for the to-be-or-not-to-be-a-node problem, I don't think it is
needed, but from the point of consistency it seems reasonable, and
it is seen in other nodes that a *Stmt node holds an options node.
But makeVacOpt, its usage, and subsequent operations on the node
look somewhat strange. Why don't you just do
"makeNode(VacuumOptions)"?

Thank you for the comment but this part has gone away as the recent
commit changed the grammar production of vacuum command.

Oops!

+      /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+    maxtuples = compute_max_dead_tuples(nblocks, nindexes > 0);

If I understand this correctly, nindexes is always > 1 there. At
least assert that it is > 0 there.

+ estdt = MAXALIGN(add_size(sizeof(LVDeadTuples),

I don't think the name is good. ('dt' read as 'detach' to me at first look.)

Fixed.

+        if (lps->nworkers_requested > 0)
+            appendStringInfo(&buf,
+                             ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",

"planned"?

The 'planned' shows how many parallel workers we planned to launch.
The degree of parallelism is determined based on either user request
or the number of indexes that the table has.

+        /* Get the next index to vacuum */
+        if (do_parallel)
+            idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+        else
+            idx = nprocessed++;

It seems that both of the two cases can be handled using
LVParallelState, and most of the branches on lps or do_parallel
can be removed.

Sorry I couldn't get your comment. You meant to move nprocessed to
LVParallelState?

Exactly. I meant letting lvshared point to private memory, but
it might introduce confusion.

Hmm, I'm not sure it would be a good idea. It would introduce
confusion, as you mentioned. And since 'nprocessed' has to be
pg_atomic_uint32 in parallel mode, we would end up having yet
another branch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#52Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#51)

At Tue, 19 Mar 2019 19:01:06 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoA3PpkcNNzcQmiNgFL3DudhdLRWoTvQE6=kRagFLjUiBg@mail.gmail.com>

On Tue, Mar 19, 2019 at 4:59 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe
you had four cores, so parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
chosen so that the I/O capacity is filled up, but this feature
intermittently saturates the CPU capacity instead under such a
situation.

I'm sorry I didn't make it clear enough. If the parallel degree is
higher than 'the number of indexes - 1' redundant workers are not
launched. So for indexes=4, 8, 16 the number of actually launched
parallel workers is up to 3, 7, 15 respectively. That's why the result
shows almost the same execution time in the cases where nindexes <=
parallel_degree.

In the 16 indexes case, the performance saturated at 4 workers,
which contradicts your explanation.

Because the machine I used has 4 cores, the performance doesn't
improve even if more than 4 parallel workers are launched.

That is what I mentioned in the cited phrases. Sorry for the
perhaps hard-to-read phrasing.

I'll share the performance test results of larger tables and indexes.

I'm not sure, but what if we did index vacuuming in a
one-tuple-by-one manner? That is, heap vacuum passes dead tuples
one by one (or buffers a few tuples) to workers, and workers
process them not by bulkdelete but by a per-tuple delete (which
we don't have). That could avoid the heap scan sitting idle while
index bulkdelete runs.

Just to be clear, in parallel lazy vacuum all parallel vacuum
processes including the leader process do index vacuuming; no one
sleeps during index vacuuming. The leader process does the heap
scan and launches parallel workers before index vacuuming. Each
process exclusively processes indexes one by one.
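
The claiming scheme, condensed from lazy_vacuum_indexes_for_worker()
in the patch (the leader runs an equivalent loop):

    for (;;)
    {
        /* atomically claim the next unprocessed index */
        idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
        if (idx >= nindexes)
            break;              /* all indexes have been claimed */

        /* bulk-delete from (or clean up) the claimed index */
        lazy_vacuum_index(indrels[idx], &stats, lvshared->reltuples,
                          dead_tuples);
    }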

The leader doesn't continue the heap scan while index vacuuming is
running. And the index-page scan seems to eat up CPU easily. If
index vacuum could run simultaneously with the next heap scan
phase, we could make the index scan finish at almost the same time
as the next round of heap scan. It would reduce the (possible) CPU
contention. But this requires twice as much shared
memory as the current implementation.
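
A minimal sketch of that layout, assuming two dead-tuple buffers in
the DSM segment (the names here are hypothetical, not from the patch):

    /* hypothetical: the heap scan fills one LVDeadTuples array while
     * the workers bulk-delete from the other, then the roles swap */
    typedef struct LVDeadTuplesPair
    {
        LVDeadTuples *filling;   /* written by the leader's heap scan */
        LVDeadTuples *draining;  /* read by index-vacuuming workers */
    } LVDeadTuplesPair;

That second buffer is what would need the doubled shared memory.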

Yeah, I've considered something like a pipelining approach where
one process continues to queue the dead tuples and another process
fetches and processes them during index vacuuming, but the current
version of the patch employs the simplest approach as the first step.
Once we have the retail index deletion approach we might be able to
use it for parallel vacuum.

Ok, I understood the direction.

...

Sorry, I couldn't follow your comment. Did you mean moving nprocessed to
LVParallelState?

Exactly. I meant letting lvshared point to private memory, but
it might introduce confusion.

Hmm, I'm not sure it would be a good idea. It would introduce
confusion as you mentioned. And since 'nprocessed' has to be
pg_atomic_uint32 in parallel mode we would end up with yet
another branch.

Ok. Agreed. Thank you for the patience.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#53Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#50)

At Tue, 19 Mar 2019 17:51:32 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCUZQmyXrwDw57ejoR-j1QrGqm_vrQKOkif_aJK4Gih6Q@mail.gmail.com>

On Tue, Mar 19, 2019 at 10:39 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

The performance results are good. Do we want to add a recommended
size in the documentation for the parallel option? The parallel option
for smaller tables can lead to performance overhead.

Hmm, I don't think we can add a specific recommended size because
the performance gain from parallel lazy vacuum depends on various things
such as CPU cores, the number of indexes, shared buffer size, index
types, and HDD or SSD. I suppose that users who want to use this option
have some sort of performance problem, such as vacuum taking a very
long time. They would use it for relatively larger tables.

I agree that we have no recommended setting, but I strongly think that documentation on the downsides or possible side effects of this feature is required for those who are going to use it.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#54Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#52)
3 attachment(s)

On Tue, Mar 19, 2019 at 7:15 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 19 Mar 2019 19:01:06 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoA3PpkcNNzcQmiNgFL3DudhdLRWoTvQE6=kRagFLjUiBg@mail.gmail.com>

On Tue, Mar 19, 2019 at 4:59 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>

For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
almost the same. I suspect that the indexes are too small, all
the index pages were in memory, and the CPU was saturated. Maybe you
had four cores, and parallel workers beyond that number had no
effect. Other normal backends would have been able to do almost
nothing meanwhile. Usually the number of parallel workers is
determined so that IO capacity is filled up, but this feature
intermittently saturates CPU capacity under such a
situation.

I'm sorry I didn't make it clear enough. If the parallel degree is
higher than 'the number of indexes - 1' redundant workers are not
launched. So for indexes=4, 8, 16 the number of actually launched
parallel workers is up to 3, 7, 15 respectively. That's why the result
shows almost the same execution time in the cases where nindexes <=
parallel_degree.

In the 16 indexes case, the performance saturated at 4 workers,
which contradicts your explanation.

Because the machine I used has 4 cores, performance doesn't
improve even if more than 4 parallel workers are launched.

That is what I mentioned in the cited phrases. Sorry for the perhaps
hard-to-read phrasing..

I understand now. Thank you!

Attached are the updated patches, incorporating all review comments.

Commit 6776142 changed the grammar production of the VACUUM command. This
patch adds the PARALLEL option on top of that commit.

I realized that commit 6776142 broke indentation in ExecVacuum() and
that including nodes/parsenodes.h is no longer needed. Sorry, that's my
fault. The attached patch (vacuum_fix.patch) fixes both, although the
indentation issue would be resolved by pgindent before release anyway.

When parsing the VACUUM command, since only the PARALLEL option can have
an argument, I've added a check in ExecVacuum to error out when other
options have an argument. But it might be good to make the other vacuum
options (perhaps except for the DISABLE_PAGE_SKIPPING option) accept an
argument, just like the EXPLAIN command.
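
For example, with this check in place, VACUUM (PARALLEL 2) tbl is
accepted, while VACUUM (VERBOSE 1) tbl now fails with ERROR:  option
"verbose" cannot have an argument.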

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

vacuum_fix.patch (application/octet-stream)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f0afeaf..016e411 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -108,9 +108,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
-				params.options |= VACOPT_ANALYZE;
+			params.options |= VACOPT_ANALYZE;
 		else if (strcmp(opt->defname, "freeze") == 0)
-				params.options |= VACOPT_FREEZE;
+			params.options |= VACOPT_FREEZE;
 		else if (strcmp(opt->defname, "full") == 0)
 			params.options |= VACOPT_FULL;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..a8ca199 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -18,7 +18,6 @@
 #include "catalog/pg_class.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
-#include "nodes/parsenodes.h"
 #include "storage/buf.h"
 #include "storage/lock.h"
 #include "utils/relcache.h"
v18-0001-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 8671e9011cdb4831ad44e3fd6ba90a2381625cbd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 09:31:41 +0900
Subject: [PATCH v18 1/2] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuum and index cleanup
with parallel workers. Individual indexes are processed by one vacuum
process. Therefore parallel vacuum can be used when the table has more
than one index.

Parallel vacuum can be performed by specifying, e.g.,
VACUUM (PARALLEL 2) tbl, which performs vacuum with 2
parallel worker processes. Specifying only PARALLEL means that the
degree of parallelism will be determined based on the number of
indexes the table has.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  20 +
 src/backend/access/heap/vacuumlazy.c  | 848 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  36 ++
 src/backend/parser/gram.y             |  11 +-
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   8 +-
 src/test/regress/expected/vacuum.out  |  10 +-
 src/test/regress/sql/vacuum.sql       |   3 +
 12 files changed, 851 insertions(+), 109 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2..3ca3ae8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..4544ef2 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,6 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE
     VERBOSE
     ANALYZE
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
     DISABLE_PAGE_SKIPPING
     SKIP_LOCKED
 
@@ -143,6 +144,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">N</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index.
+      Workers for vacuum are launched before starting each phase and exit at the
+      end of the phase. If the parallel degree
+      <replaceable class="parameter">N</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. This option cannot
+      be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..300d82b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Individual indexes are processed by one vacuum process. At the
+ * beginning of lazy vacuum (at lazy_scan_heap) we prepare the parallel context
+ * and initialize the DSM segment that contains shared information as well as
+ * the memory space for dead tuples. When starting either index vacuuming or
+ * index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed the parallel worker processes exit and the leader process
+ * re-initializes the DSM segment. Note that all parallel workers live only
+ * during one pass of index vacuuming or index cleanup, but the leader process
+ * neither exits from parallel mode nor destroys the parallel context. Since
+ * no updates are allowed during parallel mode, we update the index
+ * statistics after exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,86 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Structs for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in a DSM segment when in parallel lazy vacuum mode,
+ * or in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData)
+
+/*
+ * Shared information among parallel workers. This is allocated in
+ * a DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers to do either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either the
+	 * old live tuples for index vacuuming or the new live tuples for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared offsetof(LVShared, indstats) + sizeof(LVIndStats)
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +224,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -150,17 +241,19 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count,
+				   bool in_parallel);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,12 +261,26 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static bool lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+									IndexBulkDeleteResult **stats,
+									LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -261,7 +368,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -464,14 +571,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers.
+ *		When allocating the space for lazy scan heap, we enter parallel mode,
+ *		create the parallel context and initialize a DSM segment for dead tuples.
+ *		dead_tuples points either to a DSM segment in the parallel lazy vacuum
+ *		case or to local memory in the single process vacuum case.  Before starting
+ *		parallel index vacuuming and parallel index cleanup we launch parallel
+ *		workers.  All parallel workers exit after processing all indexes, and the
+ *		leader process re-initializes the parallel context and re-launches them at
+ *		the next execution. The index statistics are updated by the leader after
+ *		exiting from parallel mode since no writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;	/* non-NULL means we're in parallel mode */
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -494,6 +615,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -529,13 +651,37 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/* Compute the number of parallel vacuum workers to request */
+	if ((params->options & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * If we launch parallel workers, enter the parallel mode and prepare
+		 * parallel lazy vacuum.
+		 */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +729,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +784,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -713,8 +859,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +888,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+									lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +909,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1105,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1144,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1284,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1353,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1364,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1480,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1514,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1530,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+								lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1417,8 +1558,12 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+							lps, true);
+
+	/* End parallel lazy vacuum, update index statistics if not yet */
+	if (lps)
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1630,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1639,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1542,6 +1687,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1698,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1828,100 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If we're ready for parallel lazy vacuum it's
+ * performed with parallel workers. This function must be used only by the
+ * parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+						IndexBulkDeleteResult **stats, LVParallelState *lps,
+						bool for_cleanup)
+{
+	int		nprocessed = 0;
+	bool	do_parallel = false;	/* true means workers have been launched */
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	if (nindexes <= 0)
+		return;
+
+	/* Launch parallel vacuum workers if we're ready */
+	if (lps)
+		do_parallel = lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+													   for_cleanup);
+
+	for (;;)
+	{
+		/* Get the next index to vacuum */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (do_parallel &&
+			lps->lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			/*
+			 * Check 'lps', not 'do_parallel' here in order to check whether we're
+			 * in parallel mode.
+			 */
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages,
+							   lps != NULL);
+		}
+
+		/*
+		 * In parallel lazy vacuum, we copy the index bulk-deletion result
+		 * returned from ambulkdelete and amvacuumcleanup to the DSM segment
+		 * if the result on the DSM segment is not updated yet. It's necessary
+		 * because the AMs allocate the results locally and it's possible that
+		 * an index will be vacuumed by a different vacuum process the next
+		 * time. Copying the result normally happens only after the first
+		 * pass of index vacuuming. From the second pass on, we pass the result
+		 * on the DSM segment so that the AMs update it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result to a
+		 * different slot we can write them without locking.
+		 */
+		if (do_parallel &&
+			!lps->lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lps->lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lps->lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1690,11 +1930,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		vacrelstats->dead_tuples, and update running statistics.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,79 +1943,93 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
 
+	if (IsParallelWorker())
+		msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+	else
+		msg = "scanned index \"%s\" to remove %d row versions";
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ * in_parallel is true if we're performing parallel lazy vacuum. Since no
+ * updates are allowed during parallel mode we don't update statistics
+ * but store the index bulk-deletion result in *stats. Otherwise we update
+ * the statistics and set *stats to NULL.
  */
 static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count, bool in_parallel)
 {
+	IndexBulkDeleteResult *res;
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	res = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!res)
 		return;
 
 	/*
 	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
+	 * is accurate and we're not in parallel mode.
 	 */
-	if (!stats->estimated_count)
+	if (!res->estimated_count && !in_parallel)
 		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
+							res->num_pages,
+							res->num_index_tuples,
 							0,
 							false,
 							InvalidTransactionId,
 							InvalidMultiXactId,
 							false);
 
+	if (IsParallelWorker())
+		msg = "index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker";
+	else
+		msg = "index \"%s\" now contains %.0f row versions in %u pages";
+
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					res->num_index_tuples,
+					res->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   res->tuples_removed,
+					   res->pages_deleted, res->pages_free,
 					   pg_rusage_show(&ru0))));
 
-	pfree(stats);
+	if (!in_parallel)
+	{
+		pfree(res);
+		*stats = NULL;
+	}
+	else
+		*stats = res;
 }
 
 /*
@@ -2080,19 +2334,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2358,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2414,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2567,392 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The size of the table and indexes doesn't
+ * affect the parallel degree for now.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter the parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/* copy the index statistics to a temporary space */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		/*
+		 * Update statistics in pg_class, but only if the index says the
+		 * count is accurate.
+		 */
+		if (s->updated && !s->stats.estimated_count)
+			vac_update_relstats(Irel[i],
+								s->stats.num_pages,
+								s->stats.num_index_tuples,
+								0,
+								false,
+								InvalidTransactionId,
+								InvalidMultiXactId,
+								false);
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Request workers to do either vacuuming indexes or cleaning indexes */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/*
+	 * If no workers were launched, we vacuum all indexes by the leader
+	 * process alone. Since we may be able to launch workers at the next
+	 * execution we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		ereport(elevel,
+				(errmsg("could not launch parallel vacuum worker (planned: %d)",
+						lps->nworkers_requested)));
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+		return false;
+	}
+
+	/* Report parallel vacuum worker information */
+	initStringInfo(&buf);
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the DSM space so that we can relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Open relations */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/* indrels are sorted in order by OID */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or clean up all indexes. This is similar to lazy_vacuum_all_indexes
+ * but this function must be used only by parallel vacuum worker processes.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to vacuum */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If a vacuum process has already updated the bulk-deletion result, we
+		 * pass it to the index AMs. Otherwise pass NULL, as they expect NULL
+		 * for the first execution.
+		 */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(indrels[idx], &stats, lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], &stats, lvshared->reltuples,
+							   lvshared->estimated_count, true);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because the AMs allocate
+		 * the results locally and it's possible that an index will be vacuumed
+		 * by a different vacuum process the next time. Copying the
+		 * result normally happens only after the first pass of index vacuuming.
+		 * From the second pass on, we pass the result on the DSM segment so
+		 * that the AMs update it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result to a
+		 * different slot we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats)
+		{
+			memcpy(&(lvshared->indstats[idx].stats), stats,
+				   sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 016e411..b17d320 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -36,6 +36,7 @@
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_namespace.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -89,12 +90,22 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	ListCell	*lc;
 
 	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
 		DefElem	*opt = (DefElem *) lfirst(lc);
 
+		/* Only PARALLEL option can have an argument */
+		if (strcmp(opt->defname, "parallel") != 0 &&
+			opt->arg != NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_SYNTAX_ERROR),
+					 errmsg("option \"%s\" cannot have an argument",
+							opt->defname),
+					 parser_errposition(pstate, opt->location)));
+
 		/* Parse common options for VACUUM and ANALYZE */
 		if (strcmp(opt->defname, "verbose") == 0)
 			params.options |= VACOPT_VERBOSE;
@@ -115,6 +126,25 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.options |= VACOPT_FULL;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			params.options |= VACOPT_DISABLE_PAGE_SKIPPING;
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			params.options |= VACOPT_PARALLEL;
+
+			if (opt->arg == NULL)
+			{
+				/* User didn't specify the parallel degree */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -146,6 +176,12 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) &&
+		(params.options & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 502e51b..03e0809 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -309,6 +309,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <str>		vac_analyze_option_name
 %type <defelt>	vac_analyze_option_elem
 %type <list>	vac_analyze_option_list
+%type <node>	vac_analyze_option_arg
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10528,9 +10529,9 @@ analyze_keyword:
 		;
 
 vac_analyze_option_elem:
-			vac_analyze_option_name
+			vac_analyze_option_name vac_analyze_option_arg
 				{
-					$$ = makeDefElem($1, NULL, @1);
+					$$ = makeDefElem($1, $2, @1);
 				}
 		;
 
@@ -10539,6 +10540,11 @@ vac_analyze_option_name:
 			| analyze_keyword						{ $$ = "analyze"; }
 		;
 
+vac_analyze_option_arg:
+			NumericOnly								{ $$ = (Node *) $1; }
+			| /* EMPTY */		 					{ $$ = NULL; }
+		;
+
 opt_analyze:
 			analyze_keyword							{ $$ = true; }
 			| /*EMPTY*/								{ $$ = false; }
@@ -16038,6 +16044,7 @@ makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args,
 	return (Node *) x;
 }
 
+
 /*
  * Merge the input and output parameters of a table function.
  */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..22df17f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.nworkers = 0;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21c..fef80c4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3429,7 +3429,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index eb9e160..3eb7946 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -219,6 +220,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index a8ca199..fdcccef 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -144,7 +144,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8
 } VacuumOption;
 
 /*
@@ -166,6 +167,11 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	/*
+	 * The number of parallel vacuum workers: -1 by default (no parallel
+	 * vacuum), 0 to choose the degree automatically, otherwise user-requested.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..973bb33 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,12 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -116,9 +122,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..d0c209a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5
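
(To recap the option parsing above, as an illustrative summary rather than part of the patch: a plain VACUUM leaves params.nworkers at -1, i.e. no parallelism requested; VACUUM (PARALLEL) sets nworkers = 0, letting the server choose the degree; VACUUM (PARALLEL 3) sets nworkers = 3; and VACUUM (PARALLEL 0) is rejected with an error, as the regression test shows.)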

Attachment: v18-0002-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From 0b94e177daafae3a967fe981fc42dea43d5ce634 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v18 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure that the
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least <replaceable class="parameter">workers</replaceable>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..f33a432 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1: disabled, 0: PARALLEL without number of
+									 * workers. */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* 0 is allowed here, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5
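
(For reference, the TAP tests above imply command-line usage like the following; this is illustrative only, and note that getopt's optional argument requires the attached form -P2 or --parallel=2, not "-P 2":

    $ vacuumdb -P2 postgres            # issues VACUUM (PARALLEL 2) ...
    $ vacuumdb --parallel=2 postgres   # same as -P2
    $ vacuumdb -P postgres             # issues VACUUM (PARALLEL) ...
)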

#55Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#53)

On Tue, Mar 19, 2019 at 7:29 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 19 Mar 2019 17:51:32 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCUZQmyXrwDw57ejoR-j1QrGqm_vrQKOkif_aJK4Gih6Q@mail.gmail.com>

On Tue, Mar 19, 2019 at 10:39 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

The performance results are good. Do we want to add the recommended
size in the documentation for the parallel option? The parallel option for smaller
tables can lead to performance overhead.

Hmm, I don't think we can document a specific recommended size because
the performance gain from parallel lazy vacuum depends on various things
such as CPU cores, the number of indexes, shared buffer size, index
types, and HDD vs. SSD. I suppose that users who want to use this option
have some sort of performance problem, such as vacuum taking a very
long time. They would use it for relatively large tables.

I agree that we have no recommended setting, but I strongly think that documentation on the downsides or possible side effects of this feature is required for those who are going to use it.

I think that the side effect of parallel lazy vacuum would be to
consume more CPU and I/O bandwidth, but that is also true for the
other parallel utility command (i.e. parallel CREATE INDEX). The description
of max_parallel_maintenance_workers documents such things[1]https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR. Anything
else to document?

[1]: https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#56Sergei Kornilov
sk@zsrv.org
In reply to: Masahiko Sawada (#54)

Hello

* in_parallel is true if we're performing parallel lazy vacuum. Since any
* updates are not allowed during parallel mode we don't update statistics
* but set the index bulk-deletion result to *stats. Otherwise we update it
* and set NULL.

lazy_cleanup_index has the in_parallel argument only for this purpose, but the caller still has to check in_parallel after the lazy_cleanup_index call and handle stats differently for parallel execution.
Would it be better to always return stats and update the statistics in the caller? It should be possible to update all index stats in lazy_vacuum_all_indexes, for example; this routine always runs in the parallel leader and has the comment /* Do post-vacuum cleanup and statistics update for each index */ on the for_cleanup=true call.

I think we need a note in the documentation that the parallel leader is not counted in the PARALLEL N option, so with the PARALLEL 2 option we will use 3 processes. Or should we even change the behavior? With PARALLEL 1 only the current backend would run in a single process; with PARALLEL 2, the leader plus one parallel worker, i.e. two processes working in parallel.
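
For example (an illustrative sketch of the current behavior):

    VACUUM (PARALLEL 2) tbl;  -- leader process + 2 parallel workers = 3 processes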

regards, Sergei

#57Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro HORIGUCHI (#49)

On Tue, Mar 19, 2019 at 3:59 AM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

The leader doesn't continue the heap scan while index vacuuming is
running, and the index page scan seems to eat up CPU easily. If
index vacuuming can run simultaneously with the next heap scan
phase, we can make the index scan finish at almost the same time as
the next round of heap scan. That would reduce the (possible) CPU
contention. But this requires twice as much shared memory as the
current implementation.

I think you're approaching this from the wrong point of view. If we
have a certain amount of memory available, is it better to (a) fill
the entire thing with dead tuples once, or (b) fill half of
it with dead tuples, start index vacuuming, and then fill the other
half of it with dead tuples for the next index-vacuum cycle while the
current one is running? I think the answer is that (a) is clearly
better, because it results in half as many index vacuum cycles.
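
To put hypothetical numbers on it (not taken from the patch): if the dead-tuple area holds 10 million TIDs and a table produces 40 million dead tuples, strategy (a) triggers 4 index-vacuum cycles, while strategy (b), filling 5-million-TID halves, triggers 8 cycles, i.e. twice as many full passes over every index.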

We can't really ask the user how much memory it's OK to use and then
use twice as much. But if we could, what you're proposing here is
probably still not the right way to use it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#58Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#54)

On Tue, Mar 19, 2019 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

In parsing the vacuum command, since only the PARALLEL option can have an
argument, I've added a check in ExecVacuum to error out when other
options have an argument. But it might be good to make the other vacuum
options (perhaps except for the DISABLE_PAGE_SKIPPING option) accept an
argument, just like the EXPLAIN command.

I think all of the existing options, including DISABLE_PAGE_SKIPPING,
should permit an argument that is passed to defGetBoolean().
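
For illustration, assuming such a change, that would permit forms like:

    VACUUM (FREEZE TRUE, VERBOSE OFF, DISABLE_PAGE_SKIPPING 1) tbl;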

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#59Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#58)
3 attachment(s)

On Fri, Mar 22, 2019 at 4:53 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 19, 2019 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

In parsing the vacuum command, since only the PARALLEL option can have an
argument, I've added a check in ExecVacuum to error out when other
options have an argument. But it might be good to make the other vacuum
options (perhaps except for the DISABLE_PAGE_SKIPPING option) accept an
argument, just like the EXPLAIN command.

I think all of the existing options, including DISABLE_PAGE_SKIPPING,
should permit an argument that is passed to defGetBoolean().

Agreed. The attached 0001 patch does so.

On Thu, Mar 21, 2019 at 8:05 PM Sergei Kornilov <sk@zsrv.org> wrote:

Hello

Thank you for reviewing the patch!

* in_parallel is true if we're performing parallel lazy vacuum. Since any
* updates are not allowed during parallel mode we don't update statistics
* but set the index bulk-deletion result to *stats. Otherwise we update it
* and set NULL.

lazy_cleanup_index has the in_parallel argument only for this purpose, but the caller still has to check in_parallel after the lazy_cleanup_index call and handle stats differently for parallel execution.
Would it be better to always return stats and update the statistics in the caller? It should be possible to update all index stats in lazy_vacuum_all_indexes, for example; this routine always runs in the parallel leader and has the comment /* Do post-vacuum cleanup and statistics update for each index */ on the for_cleanup=true call.

Agreed. I've changed the patch so that we update index statistics in
lazy_vacuum_all_indexes().

I think we need a note in the documentation that the parallel leader is not counted in the PARALLEL N option, so with the PARALLEL 2 option we will use 3 processes. Or should we even change the behavior? With PARALLEL 1 only the current backend would run in a single process; with PARALLEL 2, the leader plus one parallel worker, i.e. two processes working in parallel.

Hmm, the documentation says "Perform vacuum index and cleanup index
phases of VACUUM in parallel using N background workers". Doesn't it
already explain that?

Attached is the updated version of the patch set. The 0001 patch allows
all existing vacuum options to take a boolean argument. The 0002 patch
introduces parallel lazy vacuum. The 0003 patch adds the -P (--parallel)
option to the vacuumdb command.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

Attachment: v19-0001-All-VACUUM-command-options-allow-an-argument.patch (application/octet-stream)
From 41702c0b900ce4fb1c195009fe3d2dfbcd53ca8d Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 22 Mar 2019 10:31:31 +0900
Subject: [PATCH v19 1/3] All VACUUM command options allow an argument.

All existing VACUUM command options now accept a boolean argument,
just like the EXPLAIN command's options.
---
 doc/src/sgml/ref/vacuum.sgml  | 27 +++++++++++++++++++++------
 src/backend/commands/vacuum.c | 13 +++++++------
 src/backend/parser/gram.y     | 10 ++++++++--
 3 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..99dda89 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -26,12 +26,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
 
 <phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
 
-    FULL
-    FREEZE
-    VERBOSE
-    ANALYZE
-    DISABLE_PAGE_SKIPPING
-    SKIP_LOCKED
+    FULL [ <replaceable class="parameter">boolean</replaceable> ]
+    FREEZE [ <replaceable class="parameter">boolean</replaceable> ]
+    VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
+    ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
+    DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
+    SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">boolean</replaceable></term>
+    <listitem>
+     <para>
+      Specifies whether the selected option should be turned on or off.
+      You can write <literal>TRUE</literal>, <literal>ON</literal>, or
+      <literal>1</literal> to enable the option, and <literal>FALSE</literal>,
+      <literal>OFF</literal>, or <literal>0</literal> to disable it.  The
+      <replaceable class="parameter">boolean</replaceable> value can also
+      be omitted, in which case <literal>TRUE</literal> is assumed.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f0afeaf..72f140e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -36,6 +36,7 @@
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_namespace.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -97,9 +98,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse common options for VACUUM and ANALYZE */
 		if (strcmp(opt->defname, "verbose") == 0)
-			params.options |= VACOPT_VERBOSE;
+			params.options |= defGetBoolean(opt) ? VACOPT_VERBOSE : 0;
 		else if (strcmp(opt->defname, "skip_locked") == 0)
-			params.options |= VACOPT_SKIP_LOCKED;
+			params.options |= defGetBoolean(opt) ? VACOPT_SKIP_LOCKED : 0;
 		else if (!vacstmt->is_vacuumcmd)
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -108,13 +109,13 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
-				params.options |= VACOPT_ANALYZE;
+			params.options |= defGetBoolean(opt) ? VACOPT_ANALYZE : 0;
 		else if (strcmp(opt->defname, "freeze") == 0)
-				params.options |= VACOPT_FREEZE;
+			params.options |= defGetBoolean(opt) ? VACOPT_FREEZE : 0;
 		else if (strcmp(opt->defname, "full") == 0)
-			params.options |= VACOPT_FULL;
+			params.options |= defGetBoolean(opt) ? VACOPT_FULL : 0;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
-			params.options |= VACOPT_DISABLE_PAGE_SKIPPING;
+			params.options |= defGetBoolean(opt) ? VACOPT_DISABLE_PAGE_SKIPPING : 0;
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 502e51b..921e7d2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -309,6 +309,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <str>		vac_analyze_option_name
 %type <defelt>	vac_analyze_option_elem
 %type <list>	vac_analyze_option_list
+%type <node>	vac_analyze_option_arg
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10528,9 +10529,9 @@ analyze_keyword:
 		;
 
 vac_analyze_option_elem:
-			vac_analyze_option_name
+			vac_analyze_option_name vac_analyze_option_arg
 				{
-					$$ = makeDefElem($1, NULL, @1);
+					$$ = makeDefElem($1, $2, @1);
 				}
 		;
 
@@ -10539,6 +10540,11 @@ vac_analyze_option_name:
 			| analyze_keyword						{ $$ = "analyze"; }
 		;
 
+vac_analyze_option_arg:
+			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| /* EMPTY */		 					{ $$ = NULL; }
+		;
+
 opt_analyze:
 			analyze_keyword							{ $$ = true; }
 			| /*EMPTY*/								{ $$ = false; }
-- 
2.10.5

Attachment: v19-0003-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From e33e3a72266a3a2d6ff3d017b0c97670ea32d81e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v19 3/3] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure that the
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least <replaceable class="parameter">workers</replaceable>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..6be3f8f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* 0 is allowed here, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

v19-0002-Add-parallel-option-to-VACUUM-command.patchapplication/octet-stream; name=v19-0002-Add-parallel-option-to-VACUUM-command.patchDownload
From 4150888f2c0953853fce7ae5964b0ad55fa00c31 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 4 Mar 2019 09:31:41 +0900
Subject: [PATCH v19 2/3] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuuming and index cleanup
with parallel workers. Each individual index is processed by one vacuum
process, so parallel vacuum can be used when the table has more than
one index.

Parallel vacuum can be requested by a command such as
VACUUM (PARALLEL 2) tbl, meaning that vacuum is performed with 2
parallel worker processes. Specifying only PARALLEL means that the
degree of parallelism is determined based on the number of indexes
the table has.

The parallel vacuum degree is limited by both the number of indexes
the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  30 +-
 src/backend/access/heap/vacuumlazy.c  | 877 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  26 +
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   8 +-
 src/test/regress/expected/vacuum.out  |  10 +-
 src/test/regress/sql/vacuum.sql       |   3 +
 12 files changed, 861 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2..3ca3ae8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command>, only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 99dda89..91b9c30 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -30,7 +30,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     FREEZE [ <replaceable class="parameter">boolean</replaceable> ]
     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
-    PARALLEL [ <replaceable class="parameter">N</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
 
@@ -144,6 +144,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index.
+      Workers are launched before the start of each phase and exit at the end
+      of the phase. This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -197,6 +211,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies the parallel degree for the <literal>PARALLEL</literal> option.
+      The value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..917a879 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Each individual index is processed by one vacuum process. At the
+ * beginning of lazy vacuum (at lazy_scan_heap) we prepare the parallel context
+ * and initialize the DSM segment that contains shared information as well as
+ * the memory space for dead tuples. When starting either index vacuuming or
+ * index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed the parallel worker processes exit and the leader process
+ * re-initializes the DSM segment. Note that the parallel workers live only
+ * during one round of either index vacuuming or index cleanup, but the leader
+ * process neither exits parallel mode nor destroys the parallel context. As
+ * for the index statistics, since no updates are allowed during parallel mode
+ * we update them after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,92 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Are we in a parallel lazy vacuum? If true, we're in parallel mode and
+ * have prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in a DSM segment in parallel lazy vacuum mode, or
+ * in local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers. So this is allocated in
+ * a DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of whether to do index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuuming or the new live tuples for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +230,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -150,17 +247,18 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,12 +266,27 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static bool lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+									IndexBulkDeleteResult **stats,
+									LVParallelState *lps, bool for_cleanup);
+static void lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+										   LVShared *lvshared, LVDeadTuples *dead_tuples,
+										   bool for_cleanup);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -261,7 +374,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -464,14 +577,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers.
+ *		When allocating the space for the lazy heap scan, we enter parallel mode,
+ *		create the parallel context and initialize a DSM segment for dead tuples.
+ *		dead_tuples points either to a DSM segment in the parallel lazy vacuum case
+ *		or to local memory in the single-process vacuum case.  Before starting
+ *		parallel index vacuuming and parallel index cleanup we launch parallel
+ *		workers.  All parallel workers exit after processing all indexes, and the
+ *		leader process re-initializes the parallel context so that it can re-launch
+ *		them at the next execution.  The index statistics are updated by the leader
+ *		after exiting parallel mode, since no writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -494,6 +621,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -529,13 +657,34 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/* Compute the number of parallel vacuum workers to request */
+	if ((params->options & VACOPT_PARALLEL) != 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* enter parallel mode and prepare parallel lazy vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +732,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +787,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -713,8 +862,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +891,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+									lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +912,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1108,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1147,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1287,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1356,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1367,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1483,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1517,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1533,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+								lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1416,9 +1560,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+	 * only the post-vacuum cleanup, and then update statistics after
+	 * exiting parallel mode.
+	 */
+	lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+							lps, true);
+
+	/*
+	 * If we're in parallel lazy vacuum, end parallel lazy vacuum and
+	 * update index statistics.
+	 */
+	if (IsInParallelVacuum(lps))
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1641,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1650,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1542,6 +1698,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1709,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1839,107 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up all indexes. If we're set up for parallel lazy vacuum,
+ * this is performed with parallel workers, so this function must only be
+ * used by the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_all_indexes(LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+						IndexBulkDeleteResult **stats, LVParallelState *lps,
+						bool for_cleanup)
+{
+	int		nprocessed = 0;
+	bool	do_parallel = false;	/* true means workers have been launched */
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	/* Launch parallel vacuum workers if we're ready */
+	if (IsInParallelVacuum(lps))
+		do_parallel = lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+													   for_cleanup);
+
+	for (;;)
+	{
+		/* Get the next index to vacuum */
+		if (do_parallel)
+			idx = pg_atomic_fetch_add_u32(&(lps->lvshared->nprocessed), 1);
+		else
+			idx = nprocessed++;
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (IsInParallelVacuum(lps) &&
+			lps->lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_rel_pages,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			if (!IsInParallelVacuum(lps))
+			{
+				/*
+				 * Update index statistics. If in parallel lazy vacuum, we will
+				 * update them after exiting parallel mode.
+				 */
+				lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+				if (stats[idx])
+					pfree(stats[idx]);
+			}
+		}
+
+		/*
+		 * In parallel lazy vacuum, we copy the index bulk-deletion result
+		 * returned from ambulkdelete and amvacuumcleanup to the DSM segment
+		 * if the result on the DSM segment is not updated yet. This is
+		 * necessary because the index AMs allocate the results locally and
+		 * the index might be vacuumed by a different vacuum process next
+		 * time. Copying the result normally happens only after the first
+		 * index vacuuming. From the second time on, we pass the result on
+		 * the DSM segment so that the index AMs update it directly.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (IsInParallelVacuum(lps) &&
+			!lps->lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lps->lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lps->lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lps->lvshared->indstats[idx].stats);
+		}
+	}
+
+	if (do_parallel)
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1690,11 +1948,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		vacrelstats->dead_tuples, and update running statistics.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,18 +1961,21 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
 
+	if (IsParallelWorker())
+		msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+	else
+		msg = "scanned index \"%s\" to remove %d row versions";
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1722,60 +1983,65 @@ lazy_vacuum_index(Relation indrel,
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
 static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = "index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker";
+	else
+		msg = "index \"%s\" now contains %.0f row versions in %u pages";
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
 
-	pfree(stats);
+/*
+ * Update index statistics in pg_class, but only if the index says the count
+ * is accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
+
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2080,19 +2346,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2370,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2426,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2579,389 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The sizes of the table and indexes don't
+ * affect the parallel degree for now.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shutdown workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/* copy the index statistics to a temporary space */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin a parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes. Return true if at least one worker
+ * has been launched.
+ */
+static bool
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Request workers to do either vacuuming indexes or cleaning indexes */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (for_cleanup)
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+	else
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 "could not launch parallel vacuum worker (planned: %d, requested: %d)",
+							 lps->pcxt->nworkers, lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 "could not launch parallel vacuum worker (planned: %d)",
+							 lps->pcxt->nworkers);
+		ereport(elevel, (errmsg("%s", buf.data)));
+
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+		return false;
+	}
+
+	/* Report parallel vacuum worker information */
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	return true;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the DSM space so that we can relaunch parallel
+		 * workers for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Open relations */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/* indrels are sorted in order by OID */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	lazy_vacuum_indexes_for_worker(indrels, nindexes, lvshared,
+								   dead_tuples,
+								   lvshared->for_cleanup);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
+
+/*
+ * Vacuum or cleanup all indexes. This is similar to lazy_vacuum_all_indexes,
+ * but this function must be used only by parallel vacuum worker processes.
+ */
+static void
+lazy_vacuum_indexes_for_worker(Relation *indrels, int nindexes,
+							   LVShared *lvshared, LVDeadTuples *dead_tuples,
+							   bool for_cleanup)
+{
+	int idx = 0;
+
+	Assert(IsParallelWorker());
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *stats = NULL;
+
+		/* Get next index to vacuum */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * If a vacuum process already updated the bulk-deletion result, we
+		 * pass it to index AMs. Otherwise pass NULL, as they expect NULL on
+		 * the first execution.
+		 */
+		if (lvshared->indstats[idx].updated)
+			stats = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(indrels[idx], &stats, lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(indrels[idx], &stats, lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because the index AMs
+		 * allocate the results locally and the index might be vacuumed by a
+		 * different vacuum process next time. Copying the result normally
+		 * happens only after the first index vacuuming. From the second
+		 * time on, we pass the result on the DSM segment so that the index
+		 * AMs update it directly.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats)
+		{
+			memcpy(&(lvshared->indstats[idx].stats), stats,
+				   sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+		}
+	}
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 72f140e..432a1f3 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -90,6 +90,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	ListCell	*lc;
 
 	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -116,6 +117,25 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.options |= defGetBoolean(opt) ? VACOPT_FULL : 0;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			params.options |= defGetBoolean(opt) ? VACOPT_DISABLE_PAGE_SKIPPING : 0;
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			params.options |= VACOPT_PARALLEL;
+
+			if (opt->arg == NULL)
+			{
+				/* User didn't specify the parallel degree */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -147,6 +167,12 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) &&
+		(params.options & VACOPT_PARALLEL))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 921e7d2..c79d962 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10542,6 +10542,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..22df17f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.nworkers = 0;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21c..fef80c4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3429,7 +3429,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index eb9e160..3eb7946 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -219,6 +220,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..e6ce35b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -145,7 +145,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_PARALLEL = 1 << 8
 } VacuumOption;
 
 /*
@@ -167,6 +168,11 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	/*
+	 * The number of parallel vacuum workers. -1 by default, meaning no
+	 * workers; 0 means choose based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..973bb33 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,12 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -116,9 +122,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..d0c209a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#60Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#59)

On Fri, Mar 22, 2019 at 4:06 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

Attached the updated version patch. 0001 patch allows all existing
vacuum options a boolean argument. 0002 patch introduces parallel
lazy vacuum. 0003 patch adds -P (--parallel) option to vacuumdb
command.

Thanks for sharing the updated patches.

0001 patch:

+ PARALLEL [ <replaceable class="parameter">N</replaceable> ]

But this patch contains the PARALLEL syntax but no explanation; I saw
that it is explained in 0002. It is not a problem, just mentioning it.

+      Specifies parallel degree for <literal>PARALLEL</literal> option. The
+      value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.

Can we also add some more details about backend participation? Parallel
workers come into the picture only when there are at least 2 indexes on
the table.
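
For reference, the worker-count logic in compute_parallel_workers() of
0002 boils down to roughly the following (a condensed sketch of the
patch's code; the nindexes - 1 presumably reflects the leader process
also taking part in index processing):

    if (nindexes <= 1)
        return 0;       /* no workers unless there are >= 2 indexes */

    if (nrequested > 0)
        parallel_workers = Min(nrequested, nindexes - 1);
    else
        parallel_workers = nindexes - 1;    /* degree not requested */

    /* cap by max_parallel_maintenance_workers */
    return Min(parallel_workers, max_parallel_maintenance_workers);

Spelling out this mapping in the documentation would make the behavior
clearer to users.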

+ /*
+ * Do post-vacuum cleanup and statistics update for each index if
+ * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+ * only post-vacuum cleanup and then update statistics after exiting
+ * parallel mode.
+ */
+ lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+ lps, true);

How about renaming the above function, as it does the cleanup also?
lazy_vacuum_or_cleanup_all_indexes?

+ if (!IsInParallelVacuum(lps))
+ {
+ /*
+ * Update index statistics. If in parallel lazy vacuum, we will
+ * update them after exiting parallel mode.
+ */
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

The above check in lazy_vacuum_all_indexes can be combined with the
outer if check where the memcpy is happening. I still feel that the
logic around the stats makes it a little bit complex.

+ if (IsParallelWorker())
+ msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+ else
+ msg = "scanned index \"%s\" to remove %d row versions";

I feel this way of building the error message may not be picked up for
translation. Is there any problem if we duplicate the entire ereport
message with the changed message?

+ for (i = 0; i < nindexes; i++)
+ {
+ LVIndStats *s = &(copied_indstats[i]);
+
+ if (s->updated)
+ lazy_update_index_statistics(Irel[i], &(s->stats));
+ }
+
+ pfree(copied_indstats);

Why can't we use the shared memory directly to update the stats once
all the workers are finished, instead of copying them to local memory?

+ tab->at_params.nworkers = 0; /* parallel lazy autovacuum is not supported */

The user is not required to provide the number of workers even when
parallel vacuum can work, so just setting the above parameter doesn't
stop the parallel workers; the user must also pass the PARALLEL option.
So mentioning that as well will be helpful later when we start
supporting it, or for someone who is reading the code.

Regards,
Haribabu Kommi
Fujitsu Australia

#61Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#57)

Hello.

At Thu, 21 Mar 2019 15:51:40 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmobkRtLb5frmEF5t9U=d+iV9c5emtN+NrRS_xrHaH1Z20A@mail.gmail.com>

On Tue, Mar 19, 2019 at 3:59 AM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

The leader doesn't continue the heap scan while index vacuuming is
running. And the index page scan seems to eat up CPU easily. If
index vacuuming can run simultaneously with the next heap scan
phase, we can make the index scan finish at almost the same time as
the next round of heap scan. It would reduce the (possible) CPU
contention. But this requires twice the size of shared memory as
the current implementation.

I think you're approaching this from the wrong point of view. If we
have a certain amount of memory available, is it better to (a) fill
the entire thing with dead tuples once, or (b) to fill half of
it with dead tuples, start index vacuuming, and then fill the other
half of it with dead tuples for the next index-vacuum cycle while the
current one is running? I think the answer is that (a) is clearly

Sure.

better, because it results in half as many index vacuum cycles.

The "problem" I see there is that it stops heap scanning on the leader
process. The leader cannot start the heap scan until the index
scans on the workers end.

The heap scan is expected not to stop under the half-and-half
strategy, especially when all the index pages are in memory. But
that is not always the case, of course.
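
To put rough numbers on it (a made-up illustration): if the TID array
holds 10 million dead-tuple pointers and a table accumulates 40 million
dead tuples, strategy (a) takes 4 index-vacuum cycles while the
half-and-half scheme takes 8 half-sized cycles; each cycle still scans
every index in full, so the total index scanning doubles, in exchange
for overlapping the heap scan with index vacuuming.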

We can't really ask the user how much memory it's OK to use and then
use twice as much. But if we could, what you're proposing here is
probably still not the right way to use it.

Yes. I thought I wrote it with that implication; "requires twice the
size" has the negative implications you wrote about above.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#62Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#59)

Hello. I forgot to mention a point.

At Fri, 22 Mar 2019 14:02:36 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD7rqZPPyV7z4bku8Mn8AE2_kRdW1hTO4Lrsp+vn_U1kQ@mail.gmail.com>

Attached the updated version patch. 0001 patch allows all existing
vacuum options a boolean argument. 0002 patch introduces parallel
lazy vacuum. 0003 patch adds -P (--parallel) option to vacuumdb
command.

+    if (IsParallelWorker())
+        msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+    else
+        msg = "scanned index \"%s\" to remove %d row versions";
ereport(elevel,
-            (errmsg("scanned index \"%s\" to remove %d row versions",
+            (errmsg(msg,
RelationGetRelationName(indrel),
-                    vacrelstats->num_dead_tuples),
+                    dead_tuples->num_tuples),

The msg variable prevents NLS from working. Please enclose the
right-hand literals with gettext_noop().
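
Something like the following should work; xgettext can then extract the
literals, and errmsg() still translates whichever one is chosen at
runtime (just a sketch of the suggested fix):

    if (IsParallelWorker())
        msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
    else
        msg = gettext_noop("scanned index \"%s\" to remove %d row versions");

    ereport(elevel,
            (errmsg(msg,
                    RelationGetRelationName(indrel),
                    dead_tuples->num_tuples),
             errdetail_internal("%s", pg_rusage_show(&ru0))));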

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#63Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Haribabu Kommi (#60)
3 attachment(s)

On Tue, Mar 26, 2019 at 10:19 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

On Fri, Mar 22, 2019 at 4:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch. 0001 patch allows all existing
vacuum options a boolean argument. 0002 patch introduces parallel
lazy vacuum. 0003 patch adds -P (--parallel) option to vacuumdb
command.

Thanks for sharing the updated patches.

Thank you for reviewing the patch.

0001 patch:

+ PARALLEL [ <replaceable class="parameter">N</replaceable> ]

But this patch contains the PARALLEL syntax but no explanation; I saw that
it is explained in 0002. It is not a problem, just mentioning it.

Oops, removed it from 0001.

+      Specifies parallel degree for <literal>PARALLEL</literal> option. The
+      value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.

Can we also add some more details about backend participation? Parallel workers
come into the picture only when there are at least 2 indexes on the table.

Agreed. I've added the description, and noted that this behavior might
change in a future release.

+ /*
+ * Do post-vacuum cleanup and statistics update for each index if
+ * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+ * only post-vacuum cleanup and then update statistics after exiting
+ * parallel mode.
+ */
+ lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+ lps, true);

How about renaming the above function, as it does the cleanup also?
lazy_vacuum_or_cleanup_all_indexes?

Agreed. I think lazy_vacuum_or_cleanup_indexes would be better. Also
the same is true for lazy_vacuum_indexes_for_workers(). Fixed.

+ if (!IsInParallelVacuum(lps))
+ {
+ /*
+ * Update index statistics. If in parallel lazy vacuum, we will
+ * update them after exiting parallel mode.
+ */
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

The above check in lazy_vacuum_all_indexes can be combined with the outer
if check where the memcpy is happening. I still feel that the logic around
the stats makes it a little bit complex.

Hmm, the memcpy is needed in both the index vacuuming and index cleanup
cases, but updating the index statistics is needed only in the index
cleanup case. I've split the code for parallel index vacuuming out of
lazy_vacuum_or_cleanup_indexes. Also, I've changed the patch so that
both the leader process and worker processes use the same code for
index vacuuming and index cleanup. I hope the code is now simpler and
more understandable.
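
So both the leader and the workers now claim indexes through the same
shared counter, along the lines of this simplified sketch of the loop:

    for (;;)
    {
        int     idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);

        if (idx >= nindexes)
            break;              /* all indexes have been claimed */

        if (!lvshared->for_cleanup)
            lazy_vacuum_index(Irel[idx], &stats[idx],
                              lvshared->reltuples, dead_tuples);
        else
            lazy_cleanup_index(Irel[idx], &stats[idx],
                               lvshared->reltuples, lvshared->estimated_count);
    }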

+ if (IsParallelWorker())
+ msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker";
+ else
+ msg = "scanned index \"%s\" to remove %d row versions";

I feel this way of building the error message may not be picked up for translation.
Is there any problem if we duplicate the entire ereport message with the changed message?

No, but I'd like to avoid writing the same arguments in multiple
ereport()s. I've added gettext_noop() as per the comment from Horiguchi-san.

+ for (i = 0; i < nindexes; i++)
+ {
+ LVIndStats *s = &(copied_indstats[i]);
+
+ if (s->updated)
+ lazy_update_index_statistics(Irel[i], &(s->stats));
+ }
+
+ pfree(copied_indstats);

Why can't we use the shared memory directly to update the stats once all the workers
are finished, instead of copying them to local memory?

Since we cannot use heap_inplace_update(), which is called by
vac_update_relstats(), during parallel mode, I copied the stats. Is it
safe if we destroy the parallel context *after* exiting parallel mode?
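
For reference, the ordering in lazy_end_parallel is (condensed from the
patch):

    /* copy the index statistics out of the DSM segment first */
    copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
    memcpy(copied_indstats, lps->lvshared->indstats,
           sizeof(LVIndStats) * nindexes);

    WaitForParallelWorkersToFinish(lps->pcxt);
    DestroyParallelContext(lps->pcxt);  /* the DSM segment goes away here */
    ExitParallelMode();

    /* heap_inplace_update() via vac_update_relstats() is allowed again */
    for (i = 0; i < nindexes; i++)
        if (copied_indstats[i].updated)
            lazy_update_index_statistics(Irel[i], &(copied_indstats[i].stats));

The local copy is what allows updating the statistics after the DSM
segment has been detached.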

+ tab->at_params.nworkers = 0; /* parallel lazy autovacuum is not supported */

The user is not required to provide the number of workers even when parallel vacuum can
work, so just setting the above parameter doesn't stop the parallel workers; the user
must also pass the PARALLEL option. So mentioning that as well will be helpful later
when we start supporting it, or for someone who is reading the code.

We should set at_params.nworkers to -1 here to disable parallel lazy
vacuum. And this review comment got me thinking that VACOPT_PARALLEL
can be combined with nworkers of VacuumParams. So I've removed
VACOPT_PARALLEL, and passing nworkers >= 0 now enables parallel lazy
vacuum.
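
That is, the convention for VacuumParams.nworkers in the attached
version is roughly:

    nworkers = -1 : parallel lazy vacuum disabled (the default; what
                    autovacuum should set)
    nworkers =  0 : PARALLEL specified without a degree; the number of
                    workers is chosen based on the number of indexes
    nworkers >  0 : user-requested degree, still capped by the number
                    of indexes and max_parallel_maintenance_workers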

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v20-0001-All-VACUUM-command-options-allow-an-argument.patchapplication/x-patch; name=v20-0001-All-VACUUM-command-options-allow-an-argument.patchDownload
From f243cf3b1b22136cad8fe981cf3b51f3cc44dfc5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 26 Mar 2019 22:13:53 +0900
Subject: [PATCH v20 1/3] All VACUUM command options allow an argument.

All existing VACUUM command options allow a boolean argument, like
the EXPLAIN command options.
---
 doc/src/sgml/ref/vacuum.sgml  | 26 ++++++++++++++++++++------
 src/backend/commands/vacuum.c | 13 +++++++------
 src/backend/parser/gram.y     | 10 ++++++++--
 src/bin/psql/tab-complete.c   |  2 ++
 4 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..906d0c2 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -26,12 +26,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
 
 <phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
 
-    FULL
-    FREEZE
-    VERBOSE
-    ANALYZE
-    DISABLE_PAGE_SKIPPING
-    SKIP_LOCKED
+    FULL [ <replaceable class="parameter">boolean</replaceable> ]
+    FREEZE [ <replaceable class="parameter">boolean</replaceable> ]
+    VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
+    ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
+    DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
+    SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +182,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">boolean</replaceable></term>
+    <listitem>
+     <para>
+      Specifies whether the selected option should be turned on or off.
+      You can write <literal>TRUE</literal>, <literal>ON</literal>, or
+      <literal>1</literal> to enable the option, and <literal>FALSE</literal>,
+      <literal>OFF</literal>, or <literal>0</literal> to disable it.  The
+      <replaceable class="parameter">boolean</replaceable> value can also
+      be omitted, in which case <literal>TRUE</literal> is assumed.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f0afeaf..72f140e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -36,6 +36,7 @@
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_namespace.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -97,9 +98,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse common options for VACUUM and ANALYZE */
 		if (strcmp(opt->defname, "verbose") == 0)
-			params.options |= VACOPT_VERBOSE;
+			params.options |= defGetBoolean(opt) ? VACOPT_VERBOSE : 0;
 		else if (strcmp(opt->defname, "skip_locked") == 0)
-			params.options |= VACOPT_SKIP_LOCKED;
+			params.options |= defGetBoolean(opt) ? VACOPT_SKIP_LOCKED : 0;
 		else if (!vacstmt->is_vacuumcmd)
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -108,13 +109,13 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
-				params.options |= VACOPT_ANALYZE;
+			params.options |= defGetBoolean(opt) ? VACOPT_ANALYZE : 0;
 		else if (strcmp(opt->defname, "freeze") == 0)
-				params.options |= VACOPT_FREEZE;
+			params.options |= defGetBoolean(opt) ? VACOPT_FREEZE : 0;
 		else if (strcmp(opt->defname, "full") == 0)
-			params.options |= VACOPT_FULL;
+			params.options |= defGetBoolean(opt) ? VACOPT_FULL : 0;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
-			params.options |= VACOPT_DISABLE_PAGE_SKIPPING;
+			params.options |= defGetBoolean(opt) ? VACOPT_DISABLE_PAGE_SKIPPING : 0;
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0a48228..5af91aa 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -309,6 +309,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <str>		vac_analyze_option_name
 %type <defelt>	vac_analyze_option_elem
 %type <list>	vac_analyze_option_list
+%type <node>	vac_analyze_option_arg
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10539,9 +10540,9 @@ analyze_keyword:
 		;
 
 vac_analyze_option_elem:
-			vac_analyze_option_name
+			vac_analyze_option_name vac_analyze_option_arg
 				{
-					$$ = makeDefElem($1, NULL, @1);
+					$$ = makeDefElem($1, $2, @1);
 				}
 		;
 
@@ -10550,6 +10551,11 @@ vac_analyze_option_name:
 			| analyze_keyword						{ $$ = "analyze"; }
 		;
 
+vac_analyze_option_arg:
+			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| /* EMPTY */		 					{ $$ = NULL; }
+		;
+
 opt_analyze:
 			analyze_keyword							{ $$ = true; }
 			| /*EMPTY*/								{ $$ = false; }
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3ba3498..36e20fb 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3432,6 +3432,8 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
-- 
1.8.3.1

v20-0002-Add-parallel-option-to-VACUUM-command.patchapplication/x-patch; name=v20-0002-Add-parallel-option-to-VACUUM-command.patchDownload
From ab8f1a80d95f13a216e4faf054444603a3b8f640 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 22 Mar 2019 10:31:31 +0900
Subject: [PATCH v20 2/3] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuuming and index cleanup
with parallel workers. Individual indexes are processed by one vacuum
process. Therefore parallel vacuum can be used when the table has more
than one index.

Parallel vacuum can be performed by specifying, for example,
VACUUM (PARALLEL 2) tbl, meaning that vacuum runs with 2
parallel worker processes. Specifying only PARALLEL means that the
degree of parallelism will be determined based on the number of
indexes the table has.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  31 ++
 src/backend/access/heap/vacuumlazy.c  | 872 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 ++
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +-
 src/test/regress/sql/vacuum.sql       |   3 +
 12 files changed, 857 insertions(+), 116 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2..3ca3ae8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without <literal>FULL</literal>
+         option. Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..d3fe0f6 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -143,6 +144,22 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index, so
+      parallel workers are launched only when there are at least <literal>2</literal>
+      indexes in the table. Workers for vacuum are launched before each phase starts
+      and exit at the end of the phase. These behaviors might change in a future release.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -196,6 +213,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies the parallel degree for the <literal>PARALLEL</literal> option.
+      The value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..a864d18 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Individual indexes are processed by one vacuum process. At the
+ * beginning of lazy vacuum (at lazy_scan_heap) we prepare the parallel context
+ * and initialize the DSM segment that contains shared information as well as
+ * the memory space for dead tuples. When starting either index vacuuming or
+ * index cleanup, we launch parallel worker processes. Once all indexes are
+ * processed the parallel worker processes exit and the leader process
+ * re-initializes the DSM segment. Note that the parallel workers live only
+ * during one round of index vacuuming or index cleanup, but the leader process
+ * neither exits from parallel mode nor destroys the parallel context. Since
+ * no updates are allowed during parallel mode, we update the index statistics
+ * after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,92 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Are we in a parallel lazy vacuum? If so, we're in parallel mode and have
+ * prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used for parallel
+ * lazy vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a DSM segment in parallel lazy vacuum mode, or in
+ * local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers, so this is allocated in
+ * a DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers to do either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples: the old live tuple
+	 * count for index vacuuming, or the new live tuple count for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. An variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +230,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -150,17 +247,18 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,12 +266,35 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -261,7 +382,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -464,14 +585,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  When allocating the space for the heap scan, we
+ *		enter parallel mode, create the parallel context and initialize a
+ *		DSM segment for dead tuples.  dead_tuples points either to the DSM
+ *		segment in the parallel case, or to local memory in the single-process
+ *		case.  Before starting parallel index vacuuming and parallel index
+ *		cleanup we launch parallel workers.  All parallel workers exit after
+ *		they have processed all indexes, and the leader re-initializes the
+ *		parallel context and re-launches them for the next execution.  The
+ *		index statistics are updated by the leader after exiting parallel
+ *		mode, since no writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -494,6 +629,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -529,13 +665,34 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/* Compute the number of parallel vacuum workers to request */
+	if (params->nworkers >= 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* enter parallel mode and prepare parallel lazy vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +740,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +795,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -713,8 +870,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +899,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +920,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1116,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1155,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1295,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1364,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1375,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1491,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1525,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1541,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1416,9 +1568,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. In parallel lazy vacuum, do only
+	 * the post-vacuum cleanup here and update the statistics after
+	 * exiting parallel mode.
+	 */
+	lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	/*
+	 * If we're in parallel lazy vacuum, end parallel lazy vacuum and
+	 * update index statistics.
+	 */
+	if (IsInParallelVacuum(lps))
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1649,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1658,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1542,6 +1706,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1717,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1847,151 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up indexes with parallel workers. This function must be
+ * used only by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+	Assert(lps != NULL);
+	Assert(nindexes > 0);
+
+	/* Launch parallel vacuum workers if we're ready */
+	lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+									 for_cleanup);
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers.  If no
+	 * workers were launched, the leader process does all the work alone.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by both the leader process
+ * and worker processes. Unlike in single-process vacuum, we don't update
+ * the index statistics after index cleanup, since that's not allowed during
+ * parallel mode; instead we copy the index bulk-deletion results from local
+ * memory to the DSM segment.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	int idx = 0;
+
+	for (;;)
+	{
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because they allocate the
+		 * results locally, and the same index may be vacuumed by a different
+		 * vacuum process next time. The copy normally happens only after the
+		 * first index vacuuming pass; from the second pass onwards we pass
+		 * the result stored in the DSM segment so that it is updated
+		 * directly.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or clean up indexes. If parallel lazy vacuum is active, the work
+ * is performed with parallel workers, so this function must be used only by
+ * the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								   int nindexes, IndexBulkDeleteResult **stats,
+								   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* Nothing to do if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	/* Do parallel lazy index vacuuming or cleanup if we're ready */
+	if (IsInParallelVacuum(lps))
+	{
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			/*
+			 * Update index statistics. In parallel lazy vacuum, we update
+			 * them after exiting parallel mode instead.
+			 */
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1690,11 +2000,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		vacrelstats->dead_tuples, and update running statistics.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,18 +2013,22 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1722,60 +2036,65 @@ lazy_vacuum_index(Relation indrel,
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
 static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class, but only if the index says the count
+ * is accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2080,19 +2399,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2423,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2479,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2632,331 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The sizes of the table and indexes don't
+ * affect the parallel degree for now.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * No parallel degree was requested, so compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode,
+ * then update the index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/* Copy the index statistics to temporary local memory */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set the shared
+ * information and launch the parallel worker processes.
+ */
+static void
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Request workers to do either vacuuming indexes or cleaning indexes */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (!for_cleanup)
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 gettext_noop("could not launch parallel vacuum worker (planned: %d, requested: %d)"),
+							 lps->pcxt->nworkers, lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 gettext_noop("could not launch parallel vacuum worker (planned: %d)"),
+							 lps->pcxt->nworkers);
+		ereport(elevel, (errmsg("%s", buf.data)));
+
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+		return;
+	}
+
+	/* Report parallel vacuum worker information */
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requsted %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Open relations */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/* indrels are sorted by OID */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 72f140e..044138a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -90,6 +90,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	ListCell	*lc;
 
 	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -116,6 +117,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.options |= defGetBoolean(opt) ? VACOPT_FULL : 0;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			params.options |= defGetBoolean(opt) ? VACOPT_DISABLE_PAGE_SKIPPING : 0;
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree, so it will be determined at
+				 * the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -147,6 +169,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5af91aa..effc85b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10553,6 +10553,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..010a49c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.nworkers = -1;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 36e20fb..685bfa3 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3431,7 +3431,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 3773a4d..4fcbfcc 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -195,6 +196,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..c4b355a 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -167,6 +167,11 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	/*
+	 * The number of parallel vacuum workers. -1 (the default) means no
+	 * workers, and 0 means the degree is chosen based on the number of
+	 * indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..973bb33 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,12 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -116,9 +122,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..d0c209a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

v20-0003-Add-paralell-P-option-to-vacuumdb-command.patch
From cafb7a75636345017e1a8d9c499751aea4b7d8be Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v20 3/3] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background
+        workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..6be3f8f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* Setting 0 is allowed, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

#64Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Masahiko Sawada (#63)

On Wed, Mar 27, 2019 at 1:31 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Mar 26, 2019 at 10:19 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

+ for (i = 0; i < nindexes; i++)
+ {
+     LVIndStats *s = &(copied_indstats[i]);
+
+     if (s->updated)
+         lazy_update_index_statistics(Irel[i], &(s->stats));
+ }
+
+ pfree(copied_indstats);

why can't we use the shared memory directly to update the stats once all
the workers are finished, instead of copying them to a local memory?

Since we cannot use heap_inplace_update(), which is called by
vac_update_relstats(), during parallel mode, I copied the stats. Is that
safe if we destroy the parallel context *after* exiting parallel mode?

OK, understood the reason behind the copy.

Thanks for the updated patches. I reviewed them again and they
are fine. I marked the patch as "ready for committer".

Regards,
Haribabu Kommi
Fujitsu Australia

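For readers following this exchange: the ordering under discussion is the
one lazy_end_parallel in the v20 patch implements. A minimal sketch, using
the names from the patch above, is:

    /* Copy the stats out of the DSM segment while it still exists */
    copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
    memcpy(copied_indstats, lps->lvshared->indstats,
           sizeof(LVIndStats) * nindexes);

    WaitForParallelWorkersToFinish(lps->pcxt);
    DestroyParallelContext(lps->pcxt);  /* the DSM segment goes away here */
    ExitParallelMode();                 /* heap_inplace_update() allowed again */

    for (i = 0; i < nindexes; i++)
    {
        if (copied_indstats[i].updated)
            lazy_update_index_statistics(Irel[i], &(copied_indstats[i].stats));
    }

    pfree(copied_indstats);

The local copy is what lets the pg_class update run outside parallel mode
without keeping the parallel context, and its DSM segment, alive.
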
#65Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#63)

On Tue, Mar 26, 2019 at 10:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for reviewing the patch.

I don't think the approach in v20-0001 is quite right.

         if (strcmp(opt->defname, "verbose") == 0)
-            params.options |= VACOPT_VERBOSE;
+            params.options |= defGetBoolean(opt) ? VACOPT_VERBOSE : 0;

It seems to me that it would be better to declare a separate
boolean for each flag at the top; e.g. bool verbose. Then here do
verbose = defGetBoolean(opt). And then after the loop do
params.options = (verbose ? VACOPT_VERBOSE : 0) | ... similarly for
other options.

The thing I don't like about the way you have it here is that it's not
going to work well for options that are true by default but can
optionally be set to false. In that case, you would need to start
with the bit set and then clear it, but |= can only set bits, not
clear them. I went and looked at the VACUUM (INDEX_CLEANUP) patch on
the other thread and it doesn't have any special handling for that
case, which makes me suspect that if you use that patch, the reloption
works as expected but VACUUM (INDEX_CLEANUP false) doesn't actually
succeed in disabling index cleanup. The structure I suggested above
would fix that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

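A minimal sketch of the shape Robert is suggesting (the v21-0001 patch
below adopts exactly this; a hypothetical default-on option would simply
initialize its boolean to true):

    bool        verbose = false;        /* one flag per option */
    ...

    foreach(lc, vacstmt->options)
    {
        DefElem *opt = (DefElem *) lfirst(lc);

        if (strcmp(opt->defname, "verbose") == 0)
            verbose = defGetBoolean(opt);   /* can both set and clear */
        ...
    }

    params.options =
        (vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE) |
        (verbose ? VACOPT_VERBOSE : 0) | ...;

Because the flag word is assembled once after the loop, an option written
as (VERBOSE false) really does clear the bit, which |= inside the loop
could never do.
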
#66Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#65)
3 attachment(s)

On Fri, Mar 29, 2019 at 4:53 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 26, 2019 at 10:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for reviewing the patch.

I don't think the approach in v20-0001 is quite right.

if (strcmp(opt->defname, "verbose") == 0)
-            params.options |= VACOPT_VERBOSE;
+            params.options |= defGetBoolean(opt) ? VACOPT_VERBOSE : 0;

It seems to me that it would be better to declare a separate
boolean for each flag at the top; e.g. bool verbose. Then here do
verbose = defGetBoolean(opt). And then after the loop do
params.options = (verbose ? VACOPT_VERBOSE : 0) | ... similarly for
other options.

The thing I don't like about the way you have it here is that it's not
going to work well for options that are true by default but can
optionally be set to false. In that case, you would need to start
with the bit set and then clear it, but |= can only set bits, not
clear them. I went and looked at the VACUUM (INDEX_CLEANUP) patch on
the other thread and it doesn't have any special handling for that
case, which makes me suspect that if you use that patch, the reloption
works as expected but VACUUM (INDEX_CLEANUP false) doesn't actually
succeed in disabling index cleanup. The structure I suggested above
would fix that.

You're right, the previous patches are wrong. Attached are the updated
version patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v21-0001-All-VACUUM-command-options-allow-an-argument.patch
From 14df117a0aac13689cf16a65c1cbda088910a215 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 26 Mar 2019 22:13:53 +0900
Subject: [PATCH v21 1/3] All VACUUM command options allow an argument.

All existing VACUUM command options now allow a boolean argument, like
the EXPLAIN command's options.
---
 doc/src/sgml/ref/vacuum.sgml  | 26 ++++++++++++++++++++------
 src/backend/commands/vacuum.c | 31 +++++++++++++++++++++++--------
 src/backend/parser/gram.y     | 10 ++++++++--
 src/bin/psql/tab-complete.c   |  2 ++
 4 files changed, 53 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..906d0c2 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -26,12 +26,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
 
 <phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
 
-    FULL
-    FREEZE
-    VERBOSE
-    ANALYZE
-    DISABLE_PAGE_SKIPPING
-    SKIP_LOCKED
+    FULL [ <replaceable class="parameter">boolean</replaceable> ]
+    FREEZE [ <replaceable class="parameter">boolean</replaceable> ]
+    VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
+    ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
+    DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
+    SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +182,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">boolean</replaceable></term>
+    <listitem>
+     <para>
+      Specifies whether the selected option should be turned on or off.
+      You can write <literal>TRUE</literal>, <literal>ON</literal>, or
+      <literal>1</literal> to enable the option, and <literal>FALSE</literal>,
+      <literal>OFF</literal>, or <literal>0</literal> to disable it.  The
+      <replaceable class="parameter">boolean</replaceable> value can also
+      be omitted, in which case <literal>TRUE</literal> is assumed.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f0afeaf..10df766 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -36,6 +36,7 @@
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_namespace.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -86,10 +87,14 @@ void
 ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 {
 	VacuumParams params;
+	bool verbose = false;
+	bool skip_locked = false;
+	bool analyze = false;
+	bool freeze = false;
+	bool full = false;
+	bool disable_page_skipping = false;
 	ListCell	*lc;
 
-	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
-
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -97,9 +102,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse common options for VACUUM and ANALYZE */
 		if (strcmp(opt->defname, "verbose") == 0)
-			params.options |= VACOPT_VERBOSE;
+			verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "skip_locked") == 0)
-			params.options |= VACOPT_SKIP_LOCKED;
+			skip_locked = defGetBoolean(opt);
 		else if (!vacstmt->is_vacuumcmd)
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -108,13 +113,13 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
-				params.options |= VACOPT_ANALYZE;
+			analyze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
-				params.options |= VACOPT_FREEZE;
+			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
-			params.options |= VACOPT_FULL;
+			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
-			params.options |= VACOPT_DISABLE_PAGE_SKIPPING;
+			disable_page_skipping = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -122,6 +127,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 					 parser_errposition(pstate, opt->location)));
 	}
 
+	/* Set vacuum options */
+	params.options =
+		(vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE) |
+		(verbose ? VACOPT_VERBOSE : 0) |
+		(skip_locked ? VACOPT_SKIP_LOCKED : 0) |
+		(analyze ? VACOPT_ANALYZE : 0) |
+		(freeze ? VACOPT_FREEZE : 0) |
+		(full ? VACOPT_FULL : 0) |
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((params.options & VACOPT_VACUUM) ||
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0a48228..5af91aa 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -309,6 +309,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <str>		vac_analyze_option_name
 %type <defelt>	vac_analyze_option_elem
 %type <list>	vac_analyze_option_list
+%type <node>	vac_analyze_option_arg
 %type <boolean>	opt_or_replace
 				opt_grant_grant_option opt_grant_admin_option
 				opt_nowait opt_if_exists opt_with_data
@@ -10539,9 +10540,9 @@ analyze_keyword:
 		;
 
 vac_analyze_option_elem:
-			vac_analyze_option_name
+			vac_analyze_option_name vac_analyze_option_arg
 				{
-					$$ = makeDefElem($1, NULL, @1);
+					$$ = makeDefElem($1, $2, @1);
 				}
 		;
 
@@ -10550,6 +10551,11 @@ vac_analyze_option_name:
 			| analyze_keyword						{ $$ = "analyze"; }
 		;
 
+vac_analyze_option_arg:
+			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| /* EMPTY */		 					{ $$ = NULL; }
+		;
+
 opt_analyze:
 			analyze_keyword							{ $$ = true; }
 			| /*EMPTY*/								{ $$ = false; }
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f14921e..c18977c 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3432,6 +3432,8 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
-- 
1.8.3.1

v21-0002-Add-parallel-option-to-VACUUM-command.patch (text/x-patch; charset=US-ASCII)
From 1af8df908700ef2793647f4d16ae1dee30d8c9bd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 22 Mar 2019 10:31:31 +0900
Subject: [PATCH v21 2/3] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuuming and index cleanup
with parallel workers. Each index is processed by one vacuum process,
so parallel vacuum can be used when the table has more than one index.

Parallel vacuum is requested by specifying e.g. VACUUM (PARALLEL 2) tbl,
which performs vacuum with 2 parallel worker processes. Specifying only
PARALLEL means that the degree of parallelism is determined based on the
number of indexes the table has.

The parallel vacuum degree is limited by both the number of
indexes the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  31 ++
 src/backend/access/heap/vacuumlazy.c  | 872 +++++++++++++++++++++++++++++-----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  29 ++
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +-
 src/test/regress/sql/vacuum.sql       |   3 +
 12 files changed, 859 insertions(+), 116 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2..3ca3ae8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..d3fe0f6 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -143,6 +144,22 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index, so
+      parallel workers are launched only when there are at least <literal>2</literal>
+      indexes in the table. Workers are launched before the start of each phase and
+      exit at the end of the phase. These behaviors might change in a future release.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -196,6 +213,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies the parallel degree for the <literal>PARALLEL</literal> option.
+      The value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..a864d18 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,19 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup in
+ * parallel. Each index is processed by one vacuum process. At the beginning
+ * of lazy vacuum (at lazy_scan_heap) we prepare the parallel context and
+ * initialize the DSM segment that contains shared information as well as the
+ * memory space for dead tuples. When starting either index vacuuming or index
+ * cleanup, we launch parallel worker processes. Once all indexes are processed
+ * the parallel worker processes exit and the leader process re-initializes the
+ * DSM segment. Note that the parallel workers live only for the duration of
+ * one round of index vacuuming or index cleanup, but the leader process
+ * neither exits parallel mode nor destroys the parallel context in between.
+ * Since updates are not allowed during parallel mode, the index statistics
+ * are updated after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +54,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +70,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +126,92 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Are we in a parallel lazy vacuum? If so, we're in parallel mode and have
+ * prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a DSM segment in parallel lazy vacuum mode, or in
+ * local memory otherwise.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers. This is allocated in a
+ * DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and vacuum settings. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of whether to do index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples for index vacuuming and to the new live tuples for
+	 * index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +230,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -150,17 +247,18 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,12 +266,35 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -261,7 +382,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -464,14 +585,28 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers.
+ *		When allocating the space for the heap scan, we enter parallel mode, create
+ *		the parallel context and initialize a DSM segment for dead tuples.
+ *		dead_tuples points either to the DSM segment in the parallel lazy vacuum
+ *		case or to local memory in the single-process vacuum case.  Before starting
+ *		parallel index vacuuming and parallel index cleanup we launch parallel
+ *		workers.  All parallel workers exit after processing all indexes, and the
+ *		leader process re-initializes the parallel context so that it can re-launch
+ *		them for the next execution.  The index statistics are updated by the leader
+ *		after exiting parallel mode, since writes are not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -494,6 +629,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -529,13 +665,34 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/* Compute the number of parallel vacuum workers to request */
+	if (params->nworkers >= 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* enter parallel mode and prepare parallel lazy vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -583,7 +740,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -638,7 +795,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -713,8 +870,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -742,10 +899,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -765,7 +920,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -961,7 +1116,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1000,7 +1155,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1140,7 +1295,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1209,8 +1364,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1221,7 +1375,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1337,7 +1491,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1371,7 +1525,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1387,10 +1541,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1416,9 +1568,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. In parallel lazy vacuum, do
+	 * only the post-vacuum cleanup here and update the statistics after
+	 * exiting parallel mode.
+	 */
+	lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	/*
+	 * If we're in parallel lazy vacuum, end parallel lazy vacuum and
+	 * update index statistics.
+	 */
+	if (IsInParallelVacuum(lps))
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1485,7 +1649,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1494,7 +1658,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1542,6 +1706,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1552,16 +1717,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1682,6 +1847,151 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up indexes with parallel workers. This function must only
+ * be called by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+	Assert(lps != NULL);
+	Assert(nindexes > 0);
+
+	/* Launch parallel vacuum workers if we're ready */
+	lazy_begin_parallel_vacuum_index(lps, vacrelstats,
+									 for_cleanup);
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers. If no
+	 * workers were launched, the leader process alone does the work.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by both the leader process
+ * and worker processes. Unlike single-process vacuum, we don't update the
+ * index statistics after index cleanup, since that's not allowed during
+ * parallel mode; instead we copy the index bulk-deletion results from local
+ * memory to the DSM segment.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	int idx = 0;
+
+	for (;;)
+	{
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because they allocate the
+		 * results locally and it's possible that an index will be vacuumed
+		 * by a different vacuum process the next time. Copying the result
+		 * normally happens only after the first index vacuuming. From the
+		 * second time onwards, we pass the result stored in the DSM segment
+		 * so that it is updated in place.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * The locally allocated result is no longer needed; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or clean up indexes. If parallel lazy vacuum is active, the work
+ * is performed with parallel workers, so this function must only be called
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								   int nindexes, IndexBulkDeleteResult **stats,
+								   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* nothing to do if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	/* Do parallel lazy index vacuuming or cleanup if we're ready */
+	if (IsInParallelVacuum(lps))
+	{
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/* Do vacuum or cleanup one index */
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			/*
+			 * Update index statistics. In parallel lazy vacuum, they are
+			 * updated after exiting parallel mode instead.
+			 */
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1690,11 +2000,11 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  *		vacrelstats->dead_tuples, and update running statistics.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1703,18 +2013,22 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.analyze_only = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1722,60 +2036,65 @@ lazy_vacuum_index(Relation indrel,
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
  */
 static void
-lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+lazy_cleanup_index(Relation indrel, IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
 
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
-
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class, but only if the index says the count
+ * is accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2080,19 +2399,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2106,34 +2423,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2147,12 +2479,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2300,3 +2632,331 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The relation sizes of the table and indexes
+ * don't affect the parallel degree for now.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Then update the index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/* copy the index statistics to a temporary space */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes.
+ */
+static void
+lazy_begin_parallel_vacuum_index(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Request workers to do either vacuuming indexes or cleaning indexes */
+	lps->lvshared->for_cleanup = for_cleanup;
+
+	if (!for_cleanup)
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/*
+	 * If no workers were launched, the leader process vacuums all indexes
+	 * alone. Since we may be able to launch workers at the next execution,
+	 * we don't want to end parallel mode yet.
+	 */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 gettext_noop("could not launch parallel vacuum worker (planned: %d, requested: %d)"),
+							 lps->pcxt->nworkers, lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 gettext_noop("could not launch parallel vacuum worker (planned: %d)"),
+							 lps->pcxt->nworkers);
+		ereport(elevel, (errmsg("%s", buf.data)));
+
+		lazy_end_parallel_vacuum_index(lps, !for_cleanup);
+		return;
+	}
+
+	/* Report parallel vacuum worker information */
+	if (for_cleanup)
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	else
+	{
+		if (lps->nworkers_requested > 0)
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers,
+							 lps->nworkers_requested);
+		else
+			appendStringInfo(&buf,
+							 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									  lps->pcxt->nworkers_launched),
+							 lps->pcxt->nworkers_launched,
+							 lps->pcxt->nworkers);
+	}
+	ereport(elevel, (errmsg("%s", buf.data)));
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next execution.
+ */
+static void
+lazy_end_parallel_vacuum_index(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Open relations */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/* indrels are sorted by OID */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+
+	/* Report the query string from leader */
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Set dead tuple space within worker */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 10df766..2e612c1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -95,6 +95,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool disable_page_skipping = false;
 	ListCell	*lc;
 
+	/* disable parallel lazy vacuum by default */
+	params.nworkers = -1;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -120,6 +123,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -161,6 +185,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5af91aa..effc85b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10553,6 +10553,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..010a49c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.nworkers = -1;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index c18977c..f489898 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3431,7 +3431,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4c07775..48df92c 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -199,6 +200,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..c4b355a 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -167,6 +167,11 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	/*
+	 * The number of parallel vacuum workers; -1 (the default) means no
+	 * workers, and 0 means choose based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..973bb33 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,12 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -116,9 +122,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..d0c209a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1
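
The hunks above give VacuumParams.nworkers a three-state convention:
-1 (the default) means no parallel vacuum, 0 means PARALLEL was given
without a degree so the degree is chosen from the number of indexes, and
N > 0 means an explicit degree that is later capped by
max_parallel_maintenance_workers. A minimal illustrative sketch of that
convention (the helper below is hypothetical, not part of the patch):

static const char *
describe_parallel_degree(int nworkers)
{
	if (nworkers < 0)
		return "serial vacuum (PARALLEL not specified)";
	else if (nworkers == 0)
		return "parallel vacuum, degree chosen from the number of indexes";
	else
		return "parallel vacuum with an explicit, capped degree";
}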

v21-0003-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch)
From e249e91db5de5325934ebb8abd87caa54c48c44b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v21 3/3] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 49 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..da65177 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using <replaceable class="parameter">workers</replaceable>
+        background worker processes.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..6be3f8f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -45,6 +45,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 means choose based on
+									 * the number of indexes */
 } vacuumingOptions;
 
 
@@ -111,6 +113,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -140,6 +143,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	progname = get_progname(argv[0]);
 
@@ -147,7 +151,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,25 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							fprintf(stderr, _("%s: number of parallel workers must be at least 1\n"),
+									progname);
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -288,9 +311,22 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		fprintf(stderr, _("%s: cannot use the \"%s\" option with \"%s\" option\n"),
+				progname, "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -895,6 +931,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1227,6 +1273,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1
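
One subtlety in the vacuumdb patch above: --parallel takes an optional
argument ("P::" in the getopt string), and getopt only recognizes an
optional argument when it is attached to the option, as in -P2 or
--parallel=2; a detached "-P 2" leaves optarg NULL. That is why the TAP
test invokes '-P2'. A minimal standalone sketch of the same parsing
(hypothetical demo program, not part of the patch):

#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
	static struct option long_options[] = {
		{"parallel", optional_argument, NULL, 'P'},
		{NULL, 0, NULL, 0}
	};
	int			c;

	/* "P::" marks the option's argument as optional */
	while ((c = getopt_long(argc, argv, "P::", long_options, NULL)) != -1)
	{
		if (c == 'P')
		{
			if (optarg != NULL)
				printf("parallel degree: %d\n", atoi(optarg));
			else
				printf("parallel degree: chosen by the server\n");
		}
	}
	return 0;
}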

#67Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#66)

On Thu, Mar 28, 2019 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

You're right, the previous patches are wrong. Attached the updated
version patches.

0001 looks good now. Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#68Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#67)

On Fri, Mar 29, 2019 at 9:28 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 28, 2019 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

You're right, the previous patches are wrong. Attached the updated
version patches.

0001 looks good now. Committed.

Thank you!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#69Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#66)
2 attachment(s)

On Fri, Mar 29, 2019 at 11:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 29, 2019 at 4:53 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 26, 2019 at 10:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for reviewing the patch.

I don't think the approach in v20-0001 is quite right.

if (strcmp(opt->defname, "verbose") == 0)
-            params.options |= VACOPT_VERBOSE;
+            params.options |= defGetBoolean(opt) ? VACOPT_VERBOSE : 0;

It seems to me that it would be better to declare a separate
boolean for each flag at the top; e.g. bool verbose. Then here do
verbose = defGetBoolean(opt). And then after the loop do
params.options = (verbose ? VACOPT_VERBOSE : 0) | ... similarly for
other options.

The thing I don't like about the way you have it here is that it's not
going to work well for options that are true by default but can
optionally be set to false. In that case, you would need to start
with the bit set and then clear it, but |= can only set bits, not
clear them. I went and looked at the VACUUM (INDEX_CLEANUP) patch on
the other thread and it doesn't have any special handling for that
case, which makes me suspect that if you use that patch, the reloption
works as expected but VACUUM (INDEX_CLEANUP false) doesn't actually
succeed in disabling index cleanup. The structure I suggested above
would fix that.
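
A minimal sketch of the suggested structure, showing only the verbose
flag (PostgreSQL internals assumed; the real loop handles every option):

	bool		verbose = false;	/* default when the option is absent */
	ListCell   *lc;

	foreach(lc, vacstmt->options)
	{
		DefElem    *opt = (DefElem *) lfirst(lc);

		if (strcmp(opt->defname, "verbose") == 0)
			verbose = defGetBoolean(opt);	/* can also clear the flag */
		/* ... handle the other options the same way ... */
	}

	/* Assemble the flag bits once, after the loop */
	params.options = (verbose ? VACOPT_VERBOSE : 0) /* | ... */ ;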

You're right, the previous patches are wrong. Attached the updated
version patches.

These patches conflict with the current HEAD. Attached the updated patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v22-0001-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From efa68e08e74840c0afa4183e14b212b4fffcbbee Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 4 Apr 2019 11:42:25 +0900
Subject: [PATCH v22 1/2] Add parallel option to VACUUM command

In parallel vacuum, we perform both index vacuuming and index cleanup
with parallel workers. Each individual index is processed by a single
vacuum process, so parallel vacuum can be used only when the table has
more than one index.

During vacuum execution, the leader process scans the heap and
collects dead tuples into the DSM segment. Before starting either
index vacuuming or index cleanup, the leader process launches parallel
vacuum workers and also participates in the parallel execution itself.
After all indexes are processed, the parallel vacuum workers exit and
the leader process reinitializes the DSM segment while keeping the
recorded dead tuples for the next execution.

Parallel vacuum is requested with syntax such as
VACUUM (PARALLEL 2) tbl, which performs vacuum with two parallel
worker processes. If the degree of parallelism is omitted, it is
determined based on the number of indexes the table has.

The parallel vacuum degree is limited by both the number of indexes
the table has and max_parallel_maintenance_workers.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  31 ++
 src/backend/access/heap/vacuumlazy.c  | 894 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  29 ++
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   3 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +-
 src/test/regress/sql/vacuum.sql       |   3 +
 12 files changed, 886 insertions(+), 111 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc1d0f7..0b65d9b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree
+         index) and <command>VACUUM</command> without the
+         <literal>FULL</literal> option.  Parallel workers are taken from
+         the pool of processes established by <xref linkend="guc-max-worker-processes"/>,
+         limited by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..d3fe0f6 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -143,6 +144,22 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index, so
+      parallel workers are launched only when there are at least <literal>2</literal>
+      indexes in the table. Workers are launched before each phase starts and exit
+      at the end of the phase. This behavior might change in a future release.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -196,6 +213,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies the parallel degree for the <literal>PARALLEL</literal> option.
+      The value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes on the relation, further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 392b35e..bacf6e8 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Each individual index is processed by a single
+ * vacuum process. At the beginning of lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes. Once all indexes are processed, the parallel worker
+ * processes exit and the leader process re-initializes the DSM segment while
+ * keeping the recorded dead tuples. Note that the parallel workers live only
+ * during one pass of index vacuuming or index cleanup; the leader process
+ * neither exits from parallel mode nor destroys the parallel context between
+ * passes. Since updates are not allowed during parallel mode, the index
+ * statistics are updated after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +55,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +71,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +127,93 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are in
+ * parallel mode and have prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic that is used by parallel
+ * lazy vacuum. This is allocated in a DSM segment.
+ */
+typedef struct LVIndStats
+{
+	bool updated;	/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in a DSM segment in parallel lazy vacuum mode,
+ * otherwise in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers; this is allocated in
+ * a DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples count for index vacuuming, or to the new live tuples
+	 * count for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +232,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -150,17 +249,18 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -168,12 +268,35 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -278,7 +401,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -480,14 +603,29 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has more than one index and parallel lazy vacuum is requested,
+ *		we execute both index vacuuming and index cleanup with parallel workers.
+ *		When allocating the space for the lazy heap scan, we enter parallel mode,
+ *		create the parallel context and initialize a DSM segment for dead tuples.
+ *		dead_tuples points either to a DSM segment in the parallel lazy vacuum
+ *		case or to local memory in the single-process vacuum case.  Before starting
+ *		parallel index vacuuming and parallel index cleanup we launch parallel
+ *		workers. All parallel workers exit after all indexes are processed, and
+ *		the leader process re-initializes the parallel context and re-launches
+ *		them at the next execution. The index statistics are updated by the leader
+ *		after exiting parallel mode, since writes are not allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -510,6 +648,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -545,13 +684,36 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/* Compute the number of parallel vacuum workers to request */
+	if (params->nworkers >= 0)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/* Enter parallel mode and prepare parallel lazy vacuum */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/* Allocate the memory space for dead tuples locally */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -599,7 +761,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -654,7 +816,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -729,8 +891,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -758,10 +920,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -781,7 +941,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -977,7 +1137,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1016,7 +1176,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1156,7 +1316,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1225,8 +1385,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (nindexes == 0 && dead_tuples->num_tuples > 0)
 		{
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
@@ -1237,7 +1396,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacuumed_pages++;
 
 			/*
@@ -1353,7 +1512,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1387,7 +1546,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1403,10 +1562,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1432,9 +1589,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+	 * only post-vacuum cleanup and then update statistics at the end of
+	 * parallel lazy vacuum.
+	 */
+	lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	/*
+	 * If we're in parallel lazy vacuum, end parallel lazy vacuum and
+	 * update index statistics.
+	 */
+	if (IsInParallelVacuum(lps))
+		lazy_end_parallel(lps, Irel, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1501,7 +1670,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1510,7 +1679,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1558,6 +1727,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1568,16 +1738,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1698,6 +1868,153 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+	Assert(IsInParallelVacuum(lps));
+	Assert(nindexes > 0);
+
+	/* Launch parallel vacuum workers if we're ready */
+	lazy_begin_parallel_index_vacuum(lps, vacrelstats,
+									 for_cleanup);
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers or by
+	 * the leader process alone if no workers could be launched.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	/*
+	 * Wait for all workers to finish, and prepare for the next index
+	 * vacuuming or index cleanup.
+	 */
+	lazy_end_parallel_index_vacuum(lps, !for_cleanup);
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by both the leader process
+ * and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup since that is not allowed during
+ * parallel mode; instead we copy index bulk-deletion results from local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	int idx = 0;
+
+	for (;;)
+	{
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * We copy the index bulk-deletion results returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment because they allocate the
+		 * results locally and it's possible that an index will be vacuumed
+		 * by a different vacuum process next time. The copying normally
+		 * happens only after the first round of index vacuuming. From the
+		 * second round on, we pass the result stored in the DSM segment so
+		 * that it is updated directly.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * no longer need the locally allocated result and now stats[idx]
+			 * points to the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or cleanup indexes. If parallel lazy vacuum is active, the work is
+ * performed with parallel workers, so this function must be used only by the
+ * parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								   int nindexes, IndexBulkDeleteResult **stats,
+								   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* nothing to do if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	if (IsInParallelVacuum(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* Do index vacuuming or index cleanup in single vacuum mode */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1708,9 +2025,10 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 static void
 lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1720,18 +2038,22 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1740,10 +2062,11 @@ lazy_vacuum_index(Relation indrel,
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,49 +2074,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (*stats == NULL)
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the index says the statistics
+ * is accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2098,19 +2427,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool hasindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2124,34 +2451,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->hasindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2165,12 +2507,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2318,3 +2660,353 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The sizes of the table and indexes don't
+ * affect the parallel degree for now.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree was not specified. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize a DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * Writes are not allowed during parallel mode and it might not be
+	 * safe to exit from parallel mode while keeping the parallel context.
+	 * So we copy the index statistics to a temporary space and update
+	 * them after exiting parallel mode.
+	 */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes.
+ */
+static void
+lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	/* Launch all parallel workers */
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/* Create the log message to report */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all indexes
+		 * alone. Since we may be able to launch parallel workers during the
+		 * next index vacuuming, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+	return;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next index vacuuming or index cleanup if necessary.
+ */
+static void
+lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Parallel vacuum worker processes don't report the vacuum progress
+ * information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels is sorted by OID, which must match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index fd2e47f..9d5cb1c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -95,6 +95,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool disable_page_skipping = false;
 	ListCell	*lc;
 
+	/* disable parallel lazy vacuum by default */
+	params.nworkers = -1;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -120,6 +123,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested, but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -161,6 +185,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b51f12d..e61de95 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10581,6 +10581,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..010a49c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.nworkers = -1;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d34bf86..62b3cd5 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3443,7 +3443,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4c07775..48df92c 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -199,6 +200,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..c4b355a 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -167,6 +167,11 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	/*
+	 * The number of parallel vacuum workers: -1 (the default) disables
+	 * parallel vacuum, and 0 means choose based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..973bb33 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,12 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -116,9 +122,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..d0c209a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

v22-0002-Add-paralell-P-option-to-vacuumdb-command.patch
From 2ddc01d6692196879493b3fbc0cf6bcd3cc48ac7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v22 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..f6ac0c6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute the vacuum in parallel by running
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
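+       <para>
+        For example, <literal>vacuumdb -P 2 postgres</literal> issues
+        <command>VACUUM (PARALLEL 2)</command> for each table in the
+        <literal>postgres</literal> database.
+       </para>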
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 25ff19e..68b10ad 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,6 +46,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -112,6 +114,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -141,6 +144,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -148,7 +152,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
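+						/* atoi() returns 0 for non-numeric input, which the check below rejects */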
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -286,9 +308,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -891,6 +926,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1222,6 +1267,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

#70Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#69)

On Thu, Apr 4, 2019 at 6:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

These patches conflict with the current HEAD. Attached the updated patches.

They'll need another rebase.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#71Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#70)
2 attachment(s)

On Fri, Apr 5, 2019 at 4:51 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Apr 4, 2019 at 6:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

These patches conflict with the current HEAD. Attached the updated patches.

They'll need another rebase.

Thank you for the notice. Rebased.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v23-0001-Add-parallel-option-to-VACUUM-command.patch
From 87061bbc5b0c2d7c47b820ed97e6d738fbd1781a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 4 Apr 2019 11:42:25 +0900
Subject: [PATCH v23 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with parallel
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot exceed the number of
indexes that the table has.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  31 ++
 src/backend/access/heap/vacuumlazy.c  | 890 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 ++
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  12 +-
 src/test/regress/sql/vacuum.sql       |   6 +
 12 files changed, 889 insertions(+), 106 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc1d0f7..0b65d9b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fdd8151..a0dd997 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -33,6 +33,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -144,6 +145,22 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuuming and index cleanup phases of
+      <command>VACUUM</command> in parallel using
+      <replaceable class="parameter">integer</replaceable> background workers
+      (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). Only one worker can be used per index,
+      so parallel workers are launched only when there are at least
+      <literal>2</literal> indexes on the table. Workers are launched before
+      the start of each phase and exit at the end of the phase. These
+      behaviors might change in a future release. This option cannot be used
+      with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
     <listitem>
      <para>
@@ -219,6 +236,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies the parallel degree for the <literal>PARALLEL</literal>
+      option. The value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c9d8312..ae077ab 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Each individual index is processed by one vacuum
+ * process. At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples. When
+ * starting either index vacuuming or index cleanup, we launch parallel worker
+ * processes. Once all indexes are processed the parallel worker processes
+ * exit and the leader process re-initializes the DSM segment while keeping
+ * recorded dead tuples. Note that parallel workers live only during one round
+ * of index vacuuming or index cleanup, but the leader process neither exits
+ * from parallel mode nor destroys the parallel context. Since updates are not
+ * allowed during parallel mode, we update the index statistics after exiting
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +55,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +71,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +127,93 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we're in
+ * parallel mode and have prepared the DSM segment.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum mode;
+ * otherwise it is allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers, so this is allocated in the
+ * DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuple count for index vacuuming, or to the new live tuple
+	 * count for index cleanup.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -130,17 +234,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -159,10 +258,11 @@ static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumb
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -170,12 +270,35 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
 
 
 /*
@@ -490,6 +613,17 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		create the parallel context and the DSM segment before starting the
+ *		heap scan. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all indexes.
+ *		At the end of this function we exit from parallel mode. Index
+ *		bulk-deletion results are stored in the DSM segment, and we update the
+ *		index statistics as a whole after exiting parallel mode, since writes
+ *		are not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -498,6 +632,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -523,6 +659,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -559,13 +696,45 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum; allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -743,8 +912,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -772,10 +941,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -795,7 +962,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -991,7 +1158,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1030,7 +1197,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1180,7 +1347,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1250,7 +1417,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1271,7 +1438,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				 * the next vacuum will process them anyway.
 				 */
 				Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
-				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+				nleft_dead_itemids += dead_tuples->num_tuples;
 			}
 
 			/*
@@ -1279,7 +1446,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1394,7 +1561,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1435,7 +1602,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1451,10 +1618,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1480,11 +1645,20 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/*
+	 * Do post-vacuum cleanup and update statistics for each index if we're
+	 * not in parallel lazy vacuum. If in parallel lazy vacuum, do only
+	 * post-vacuum cleanup, and then update statistics at the end of parallel
+	 * lazy vacuum.
+	 */
 	if (vacrelstats->useindex)
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	if (IsInParallelVacuum(lps))
 	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+		/* End parallel mode and update index statistics */
+		lazy_end_parallel(lps, Irel, nindexes);
 	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
@@ -1554,7 +1728,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1563,7 +1737,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1611,6 +1785,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1621,16 +1796,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1751,6 +1926,154 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be
+ * called only by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+	Assert(IsInParallelVacuum(lps));
+	Assert(nindexes > 0);
+
+	/* Launch parallel vacuum workers if we're ready */
+	lazy_begin_parallel_index_vacuum(lps, vacrelstats,
+									 for_cleanup);
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers, or by
+	 * the leader process alone if no workers could be launched.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	/*
+	 * Wait for all workers to finish, and prepare for the next index
+	 * vacuuming or index cleanup.
+	 */
+	lazy_end_parallel_index_vacuum(lps, !for_cleanup);
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by both the leader process
+ * and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup since that is not allowed during
+ * parallel mode; instead we copy index bulk-deletion results from local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete or
+		 * amvacuumcleanup to the DSM segment the first time we receive it,
+		 * because the access method allocates it locally and the index may be
+		 * vacuumed by a different vacuum process next time. Copying the result
+		 * normally happens only after the first index vacuuming. From the
+		 * second time on, we pass the result stored in the DSM segment so that
+		 * the access method updates it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result at different
+		 * slots we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or cleanup indexes. If we're ready for parallel lazy vacuum, it's
+ * performed with parallel workers, so this function must be called only by
+ * the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							   int nindexes, IndexBulkDeleteResult **stats,
+							   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* Nothing to do if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	if (IsInParallelVacuum(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* Do index vacuuming or index cleanup in single vacuum mode */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1761,9 +2084,10 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 static void
 lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1773,18 +2097,22 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1793,10 +2121,11 @@ lazy_vacuum_index(Relation indrel,
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1804,49 +2133,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
 	if (!stats)
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics is accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	/* Update index statistics */
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2151,19 +2486,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2177,34 +2510,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2218,12 +2566,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2371,3 +2719,353 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The sizes of the table and indexes don't
+ * affect the parallel degree for now. nrequested is the number of parallel
+ * workers that the user requested, and nindexes is the number of indexes
+ * that the table has.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shutdown workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * No writes are allowed during parallel mode, and it might not be
+	 * safe to exit from parallel mode while keeping the parallel context.
+	 * So we copy the index statistics to a temporary space and update
+	 * them after exiting parallel mode.
+	 */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes.
+ */
+static void
+lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/* Create the log message to report */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all indexes
+		 * alone. Since we may be able to launch parallel workers for the next
+		 * round of index vacuuming, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requsted %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+	return;
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next index vacuuming or index cleanup if necessary.
+ */
+static void
+lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1a7291d..d0a650e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -98,6 +98,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -126,6 +127,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			disable_page_skipping = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "index_cleanup") == 0)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -167,6 +189,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b51f12d..e61de95 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10581,6 +10581,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0976029..c5005c8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2887,6 +2887,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+		tab->at_params.nworkers = -1;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7c4e5fba..827afc0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3445,7 +3445,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP");
+						  "INDEX_CLEANUP", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 77e5e60..c1410c4 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -201,6 +202,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 9cc6e0d..9504a01 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -182,6 +182,11 @@ typedef struct VacuumParams
 									 * to use default */
 	VacOptTernaryValue index_cleanup;	/* Do index vacuum and cleanup,
 										* default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers
+	 * and 0 for choosing based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 6ba7cd7..74a69b5 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,14 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
@@ -124,9 +132,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 57e0f35..cfedaf3 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,12 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-- 
2.10.5
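
As a quick illustration of what this patch accepts (a sketch based on
the regression test changes above; the table name is made up):

    -- let VACUUM choose the parallel degree
    VACUUM (PARALLEL) tbl;
    -- request an explicit parallel degree
    VACUUM (PARALLEL 2) tbl;
    -- rejected: the degree must be at least 1
    VACUUM (PARALLEL 0) tbl;
    -- rejected: FULL cannot be combined with PARALLEL
    VACUUM (FULL, PARALLEL 2) tbl;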

v23-0002-Add-paralell-P-option-to-vacuumdb-command.patch (application/x-patch)
From 2d7db9a90bccbe0e80a3078c0f856a002ff3d83f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v23 2/2] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..f6ac0c6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-workers-maintenance"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 25ff19e..68b10ad 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,6 +46,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -112,6 +114,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -141,6 +144,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -148,7 +152,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without a parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -286,9 +308,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -891,6 +926,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1222,6 +1267,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5
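
For reference, the SQL this option makes vacuumdb emit matches the TAP
test expectations above; roughly (the target relation name, if any,
follows the option list):

    VACUUM (PARALLEL 2) ...;   -- from: vacuumdb -P2
    VACUUM (PARALLEL) ...;     -- from: vacuumdb -P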

#72Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#71)

Thank you for the rebased version.

At Fri, 5 Apr 2019 13:59:36 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoC_s0H0x-dDPhVJEqMYcnKYOMjESXd6r_9bbc3ZZegg1A@mail.gmail.com>

Thank you for the notice. Rebased.

+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies parallel degree for <literal>PARALLEL</literal> option. The
+      value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.
+     </para>
+    </listitem>
+   </varlistentry>

I'm quite confused by this. I suppose the <para> should be a
description of the <integer> parameter itself; the existing
<boolean> entry describes the boolean itself.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#73Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#72)

On Fri, Apr 5, 2019 at 3:47 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Thank you for the rebased version.

At Fri, 5 Apr 2019 13:59:36 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoC_s0H0x-dDPhVJEqMYcnKYOMjESXd6r_9bbc3ZZegg1A@mail.gmail.com>

Thank you for the notice. Rebased.

+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies parallel degree for <literal>PARALLEL</literal> option. The
+      value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.
+     </para>
+    </listitem>
+   </varlistentry>

Thank you for reviewing the patch.

I'm quite confused by this. I suppose the <para> should be a
description of the <integer> parameter itself; the existing
<boolean> entry describes the boolean itself.

Indeed. How about the following description?

PARALLEL
Perform the vacuum index and cleanup index phases of VACUUM in parallel
using integer background workers (for details of each vacuum phase,
please refer to Table 27.25). If the parallel degree integer is
omitted, then VACUUM decides the number of workers based on the number
of indexes on the relation, further limited by
max_parallel_maintenance_workers. Only one worker can be used per
index, so parallel workers are launched only when there are at least 2
indexes in the table. Workers for vacuum are launched before starting
each phase and exit at the end of the phase. These behaviors might
change in a future release. This option cannot be used with the FULL
option.

integer
Specifies a positive integer value passed to the selected option. The
integer value can also be omitted, in which case the default value of
the selected option is used.
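
To make the degree computation concrete (a sketch; the tables here are
hypothetical, and the result is further capped by
max_parallel_maintenance_workers):

    -- three_index_tbl has three indexes: at most nindexes - 1 = 2
    -- workers are used even though 8 were requested
    VACUUM (PARALLEL 8) three_index_tbl;
    -- a table with a single index never launches parallel workers
    VACUUM (PARALLEL 2) one_index_tbl;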

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#74Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#73)
2 attachment(s)

On Fri, Apr 5, 2019 at 4:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Apr 5, 2019 at 3:47 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Thank you for the rebased version.

At Fri, 5 Apr 2019 13:59:36 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoC_s0H0x-dDPhVJEqMYcnKYOMjESXd6r_9bbc3ZZegg1A@mail.gmail.com>

Thank you for the notice. Rebased.

+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies parallel degree for <literal>PARALLEL</literal> option. The
+      value must be at least 1. If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on number of
+      indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.
+     </para>
+    </listitem>
+   </varlistentry>

Thank you for reviewing the patch.

I'm quite confused by this. I suppose the <para> should be a
description of the <integer> parameter itself; the existing
<boolean> entry describes the boolean itself.

Indeed. How about the following description?

Attached the updated version patches.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v24-0002-Add-paralell-P-option-to-vacuumdb-command.patch (application/x-patch)
From f9574677afd18cb8b4d42a7d6048ab5fbb54a588 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v24 2/2] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..f6ac0c6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-workers-maintenance"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..5ab87f3 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 25ff19e..68b10ad 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,6 +46,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -112,6 +114,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -141,6 +144,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -148,7 +152,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without a parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -286,9 +308,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -891,6 +926,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1222,6 +1267,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

v24-0001-Add-parallel-option-to-VACUUM-command.patch (application/x-patch)
From b71c270b13d34a2ffc7200b14f46f816b880eff8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 4 Apr 2019 11:42:25 +0900
Subject: [PATCH v24 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with parallel
workers. Individual indexes are each processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than the
number of indexes that the table has.

The parallel degree is either specified by the user or determined
based on the number of indexes that the table has, and is further
limited by max_parallel_maintenance_workers. The table size and index
size don't affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  34 ++
 src/backend/access/heap/vacuumlazy.c  | 890 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 ++
 src/backend/parser/gram.y             |   1 +
 src/backend/postmaster/autovacuum.c   |   1 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   2 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  12 +-
 src/test/regress/sql/vacuum.sql       |   6 +
 12 files changed, 892 insertions(+), 106 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc1d0f7..0b65d9b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2226,13 +2226,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option. Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index c652f8b..e120d82 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -33,6 +33,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -205,6 +206,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please refer
+      to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Only one worker
+      can be used per index, so parallel workers are launched only when
+      there are at least <literal>2</literal> indexes in the table. Workers
+      for vacuum are launched before starting each phase and exit at the end
+      of the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -219,6 +241,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c9d8312..ae077ab 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Individual indexes are each processed by one
+ * vacuum process. At the beginning of lazy vacuum (in lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes. Once all indexes are processed the parallel worker
+ * processes exit and the leader process re-initializes the DSM segment while
+ * keeping the recorded dead tuples. Note that the parallel workers live only
+ * for the duration of one index vacuuming or index cleanup pass, and the
+ * leader process neither exits parallel mode nor destroys the parallel
+ * context in between. Since no updates are allowed during parallel mode, we
+ * update the index statistics after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +55,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +71,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +127,93 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check whether we are in a parallel lazy vacuum. If true, we're in
+ * parallel mode and have prepared the DSM segment.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVIndStats;
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment when parallel lazy vacuum
+ * mode, otherwise allocated in a local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers, allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to do index vacuuming or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set the old
+	 * live tuples in the index vacuuming case and the new live tuples in
+	 * the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. An variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVIndStats			indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -130,17 +234,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -159,10 +258,11 @@ static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumb
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats);
+				  double reltuples,
+				  LVDeadTuples	*dead_tuples);
 static void lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats);
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
@@ -170,12 +270,35 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr);
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+											 bool for_cleanup);
+static void lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -490,6 +613,17 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		create the parallel context and the DSM segment before starting the
+ *		heap scan. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup and exit once done with all indexes. At
+ *		the end of this function we exit parallel mode. Index bulk-deletion
+ *		results are stored in the DSM segment, and we update the index
+ *		statistics as a whole after exiting parallel mode, since no writes
+ *		are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -498,6 +632,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -523,6 +659,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -559,13 +696,45 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = lazy_prepare_parallel(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum; allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -743,8 +912,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -772,10 +941,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -795,7 +962,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -991,7 +1158,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1030,7 +1197,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1180,7 +1347,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1250,7 +1417,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1271,7 +1438,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				 * the next vacuum will process them anyway.
 				 */
 				Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
-				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+				nleft_dead_itemids += dead_tuples->num_tuples;
 			}
 
 			/*
@@ -1279,7 +1446,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1394,7 +1561,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
@@ -1435,7 +1602,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1451,10 +1618,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1480,11 +1645,20 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/*
+	 * Do post-vacuum cleanup and update the statistics for each index if
+	 * we're not in parallel lazy vacuum. In parallel lazy vacuum, do only
+	 * post-vacuum cleanup here and update the statistics at the end of
+	 * parallel lazy vacuum.
+	 */
 	if (vacrelstats->useindex)
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	if (IsInParallelVacuum(lps))
 	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+		/* End parallel mode and update index statistics */
+		lazy_end_parallel(lps, Irel, nindexes);
 	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
@@ -1554,7 +1728,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1563,7 +1737,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1611,6 +1785,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1621,16 +1796,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1751,6 +1926,154 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	Assert(!IsParallelWorker());
+	Assert(IsInParallelVacuum(lps));
+	Assert(nindexes > 0);
+
+	/* Launch parallel vacuum workers if we're ready */
+	lazy_begin_parallel_index_vacuum(lps, vacrelstats,
+									 for_cleanup);
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers, or by
+	 * the leader process alone if no workers could be launched.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	/*
+	 * Wait for all workers to finish, and prepare for the next index
+	 * vacuuming or index cleanup.
+	 */
+	lazy_end_parallel_index_vacuum(lps, !for_cleanup);
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by both the leader process
+ * and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup since that is not allowed during
+ * parallel mode; instead we copy the index bulk-deletion results from
+ * local memory to the DSM segment and update them at the end of parallel
+ * lazy vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the local pointer to the corresponding bulk-deletion result
+		 * if someone already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							  dead_tuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete and
+		 * amvacuumcleanup to the DSM segment the first time we get it, because
+		 * they allocate it locally and a different vacuum process may vacuum
+		 * the index next time. Copying the result normally happens only after
+		 * the first index vacuuming. From the second time onward, we pass the
+		 * result on the DSM segment so that the index AMs update it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result at different
+		 * slots we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points to the DSM segment.
+			 * points to the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or cleanup indexes. If parallel lazy vacuum is active, the work
+ * is performed with parallel workers. This function must be used by the
+ * parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							   int nindexes, IndexBulkDeleteResult **stats,
+							   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	if (IsInParallelVacuum(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* Do index vacuuming or index cleanup in single-process vacuum mode */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->old_live_tuples,
+							  vacrelstats->dead_tuples);
+		else
+		{
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
@@ -1761,9 +2084,10 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 static void
 lazy_vacuum_index(Relation indrel,
 				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+				  double reltuples, LVDeadTuples *dead_tuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1773,18 +2097,22 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
@@ -1793,10 +2121,11 @@ lazy_vacuum_index(Relation indrel,
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1804,49 +2133,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (*stats == NULL)
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	/* Update index statistics */
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2151,19 +2486,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2177,34 +2510,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2218,12 +2566,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2371,3 +2719,353 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers if the
+ * table has more than one index. The relation sizes of the table and
+ * indexes don't affect the parallel degree for now. nrequested is the
+ * number of parallel workers that the user requested, and nindexes is the
+ * number of indexes that the table has.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+		parallel_workers = Min(nrequested, nindexes - 1);
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+lazy_prepare_parallel(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+lazy_end_parallel(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	LVIndStats *copied_indstats = NULL;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * Writes are not allowed during parallel mode, and it might not be
+	 * safe to exit parallel mode while keeping the parallel context.
+	 * So we copy the index statistics to a temporary space and update
+	 * them after exiting parallel mode.
+	 */
+	copied_indstats = palloc(sizeof(LVIndStats) * nindexes);
+	memcpy(copied_indstats, lps->lvshared->indstats,
+		   sizeof(LVIndStats) * nindexes);
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVIndStats *s = &(copied_indstats[i]);
+
+		if (s->updated)
+			lazy_update_index_statistics(Irel[i], &(s->stats));
+	}
+
+	pfree(copied_indstats);
+}
+
+/*
+ * Begin parallel index vacuuming or index cleanup. Set shared information
+ * and launch parallel worker processes.
+ */
+static void
+lazy_begin_parallel_index_vacuum(LVParallelState *lps, LVRelStats *vacrelstats,
+								 bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/* We can only provide an approximate value of num_heap_tuples here */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	initStringInfo(&buf);
+
+	/* Create the log message to report */
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all
+		 * indexes alone. Since we may be able to launch parallel workers
+		 * during the next index vacuuming, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+}
+
+/*
+ * Wait for all worker processes to finish and reinitialize DSM for
+ * the next index vacuuming or index cleanup if necessary.
+ */
+static void
+lazy_end_parallel_index_vacuum(LVParallelState *lps, bool reinitialize)
+{
+	Assert(!IsParallelWorker());
+
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	if (reinitialize)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers work only within index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted in order by OID, which should
+	 * match the order in the leader process.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1a7291d..d0a650e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -98,6 +98,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -126,6 +127,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			disable_page_skipping = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "index_cleanup") == 0)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -167,6 +189,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b51f12d..e61de95 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10581,6 +10581,7 @@ vac_analyze_option_name:
 
 vac_analyze_option_arg:
 			opt_boolean_or_string					{ $$ = (Node *) makeString($1); }
+			| NumericOnly							{ $$ = (Node *) $1; }
 			| /* EMPTY */		 					{ $$ = NULL; }
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0976029..c5005c8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2887,6 +2887,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+		tab->at_params.nworkers = -1;	/* parallel lazy autovacuum is not supported */
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7c4e5fba..827afc0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3445,7 +3445,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP");
+						  "INDEX_CLEANUP", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 77e5e60..c1410c4 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -14,6 +14,7 @@
 #ifndef HEAPAM_H
 #define HEAPAM_H
 
+#include "access/parallel.h"
 #include "access/relation.h"	/* for backward compatibility */
 #include "access/relscan.h"
 #include "access/sdir.h"
@@ -201,6 +202,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 				struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 9cc6e0d..9504a01 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -182,6 +182,11 @@ typedef struct VacuumParams
 									 * to use default */
 	VacOptTernaryValue index_cleanup;	/* Do index vacuum and cleanup,
 										* default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers: -1 by default (no workers),
+	 * and 0 to choose the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 6ba7cd7..74a69b5 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,14 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
@@ -124,9 +132,9 @@ ERROR:  column "does_not_exist" of relation "vacparted" does not exist
 ANALYZE (VERBOSE) does_not_exist;
 ERROR:  relation "does_not_exist" does not exist
 ANALYZE (nonexistent-arg) does_not_exist;
-ERROR:  syntax error at or near "-"
+ERROR:  syntax error at or near "arg"
 LINE 1: ANALYZE (nonexistent-arg) does_not_exist;
-                            ^
+                             ^
 ANALYZE (nonexistentarg) does_not_exit;
 ERROR:  unrecognized ANALYZE option "nonexistentarg"
 LINE 1: ANALYZE (nonexistentarg) does_not_exit;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 57e0f35..cfedaf3 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,12 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-- 
1.8.3.1

#75Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#74)

Hello.

# Is this patch still alive? I changed the status to "needs review"

At Sat, 6 Apr 2019 06:47:32 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoAuD3txrxucnVtM6NGo=JGSjs3VDkoCzN0jGz_egc_82g@mail.gmail.com>

Indeed. How about the following description?

Attached the updated version patches.

Thanks.

heapam.h includes access/parallel.h, but the file doesn't use
parallel.h stuff; storage/shm_toc.h and storage/dsm.h are
enough.

+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.

Yeah, this is right, but "plan_node_id" seems abrupt
there. Please prepend something like "differently from parallel
execution code", or, rather, I think no excuse is needed for using
those numbers; it is the executor code that already makes an excuse
for its unusually large numbers.
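
For instance (just a sketch of the wording; the key values are the
ones already in the patch):

/*
 * DSM keys for parallel lazy vacuum. Unlike other parallel execution
 * code, we don't need to worry about DSM keys conflicting with
 * plan_node_id here, so small integers are fine.
 */
#define PARALLEL_VACUUM_KEY_SHARED			1
#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3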

+ * Macro to check if we in a parallel lazy vacuum. If true, we're in parallel
+ * mode and prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)

we *are* in?

The name "IsInParallelVacuum()" looks (to me) like it suggests
"this process is a parallel vacuum worker". How about
ParallelVacuumIsActive?
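
That is, something like the following, keeping the existing
LVParallelState pointer as the argument (only a sketch of the naming):

#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)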

+typedef struct LVIndStats
+typedef struct LVDeadTuples
+typedef struct LVShared
+typedef struct LVParallelState

The names are confusing, and the name LVShared is too
generic. Structs that live only in shared memory are better marked
as such in the name. That is, maybe it would be better if LVIndStats
were LVSharedIndStats and LVShared were LVSharedRelStats.

It might be better to move LVIndStats out of LVShared,
but I'm not confident.
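
For the first rename, that would look roughly like this (a sketch
only; the fields are the ones from the patch):

typedef struct LVSharedIndStats
{
	IndexBulkDeleteResult	stats;
	bool					updated;	/* are the stats updated? */
} LVSharedIndStats;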

+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel
...
+	lazy_begin_parallel_index_vacuum(lps, vacrelstats, for_cleanup);
...
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+                                  lps->lvshared, vacrelstats->dead_tuples);
...
+	lazy_end_parallel_index_vacuum(lps, !for_cleanup);

The function takes the parameter for_cleanup, but the flag is
used by the three subfunctions in an utterly non-uniform way. It
seems useless to me to store for_cleanup in lvshared, and lazy_end is
rather confusing. There's no explanation why "reinitialization"
== "!for_cleanup". In the first place,
lazy_begin_parallel_index_vacuum and
lazy_end_parallel_index_vacuum are called only from this function
and are rather short, so it doesn't seem reasonable for them to be
independent functions.
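
To illustrate, the caller could absorb both helpers, roughly like this
(a sketch only, reusing the names and calls from the patch, and
leaving out the reltuples/estimated_count setup):

static void
lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
										int nindexes, IndexBulkDeleteResult **stats,
										LVParallelState *lps, bool for_cleanup)
{
	/* tell the workers what to do, then launch them */
	lps->lvshared->for_cleanup = for_cleanup;
	LaunchParallelWorkers(lps->pcxt);

	/* the leader participates as well */
	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
										  lps->lvshared,
										  vacrelstats->dead_tuples);

	/* wait for the workers and prepare for a possible next round */
	WaitForParallelWorkersToFinish(lps->pcxt);
	if (!for_cleanup)
	{
		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
		ReinitializeParallelDSM(lps->pcxt);
	}
}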

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#76Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#75)

On Mon, Apr 8, 2019 at 7:25 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

# Is this patch still alive? I changed the status to "needs review"

At Sat, 6 Apr 2019 06:47:32 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoAuD3txrxucnVtM6NGo=JGSjs3VDkoCzN0jGz_egc_82g@mail.gmail.com>

Indeed. How about the following description?

Attached the updated version patches.

Thanks.

Thank you for reviewing the patch!

heapam.h includes access/parallel.h, but the file doesn't use
parallel.h stuff; storage/shm_toc.h and storage/dsm.h are
enough.

Fixed.

+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.

Yeah, this is right, but "plan_node_id" seems abrupt
there. Please prepend something like "differently from parallel
execution code", or, rather, I think no excuse is needed for using
those numbers; it is the executor code that already makes an excuse
for its unusually large numbers.

Fixed.

+ * Macro to check if we in a parallel lazy vacuum. If true, we're in parallel
+ * mode and prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)

we *are* in?

Fixed.

The name "IsInParallelVacuum()" looks (to me) like it suggests
"this process is a parallel vacuum worker". How about
ParallelVacuumIsActive?

Fixed.

+typedef struct LVIndStats
+typedef struct LVDeadTuples
+typedef struct LVShared
+typedef struct LVParallelState

The names are confusing, and the name LVShared is too
generic. Structs that live only in shared memory are better marked
as such in the name. That is, maybe it would be better if LVIndStats
were LVSharedIndStats and LVShared were LVSharedRelStats.

Hmm, LVShared actually also stores various things that are not
relevant to the relation. I'm not sure it's a good idea to rename
it to LVSharedRelStats. When we support parallel vacuum for other
vacuum steps, adding a struct that stores only relation statistics
might work well.

It might be better to move LVIndStats out of LVShared,
but I'm not confident.

+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel
...
+       lazy_begin_parallel_index_vacuum(lps, vacrelstats, for_cleanup);
...
+       do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+                                  lps->lvshared, vacrelstats->dead_tuples);
...
+       lazy_end_parallel_index_vacuum(lps, !for_cleanup);

The function takes the parameter for_cleanup, but the flag is
used by the three subfunctions in an utterly non-uniform way. It
seems useless to me to store for_cleanup in lvshared

I think that we need to store for_cleanup, or something telling
vacuum workers whether to do index vacuuming or index cleanup, in
lvshared. Or can we use something else instead?
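
(For what it's worth, the worker side currently just branches on that
flag, as in the patch:

	if (!lvshared->for_cleanup)
		lazy_vacuum_index(Irel[idx], &stats[idx], lvshared->reltuples,
						  dead_tuples);
	else
		lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
						   lvshared->estimated_count);

so some shared indication of the current phase seems necessary.)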

and lazy_end is
rather confusing.

Ah, I used "lazy" as the prefix for functions in vacuumlazy.c. Fixed.

There's no explanation why "reinitialization"
== "!for_cleanup". In the first place,
lazy_begin_parallel_index_vacuum and
lazy_end_parallel_index_vacuum are called only from this function
and are rather short, so it doesn't seem reasonable for them to be
independent functions.

Okay, agreed. Fixed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#77Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#76)
2 attachment(s)

On Wed, Apr 10, 2019 at 2:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Apr 8, 2019 at 7:25 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

# Is this patch still alive? I changed the status to "needs review"

At Sat, 6 Apr 2019 06:47:32 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoAuD3txrxucnVtM6NGo=JGSjs3VDkoCzN0jGz_egc_82g@mail.gmail.com>

Indeed. How about the following description?

Attached the updated version patches.

Thanks.

Thank you for reviewing the patch!

heapam.h includes access/parallel.h, but the file doesn't use
parallel.h stuff; storage/shm_toc.h and storage/dsm.h are
enough.

Fixed.

+ * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
+ * keys conflicting with plan_node_id we can use small integers.

Yeah, this is right, but "plan_node_id" seems abrupt
there. Please prepend something like "differently from parallel
execution code", or, rather, I think no excuse is needed for using
those numbers; it is the executor code that already makes an excuse
for its unusually large numbers.

Fixed.

+ * Macro to check if we in a parallel lazy vacuum. If true, we're in parallel
+ * mode and prepared the DSM segments.
+ */
+#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)

we *are* in?

Fixed.

The name "IsInParallelVacuum()" looks (to me) like it suggests
"this process is a parallel vacuum worker". How about
ParallelVacuumIsActive?

Fixed.

+typedef struct LVIndStats
+typedef struct LVDeadTuples
+typedef struct LVShared
+typedef struct LVParallelState

The names are confusing, and the name LVShared is too
generic. Structs that live only in shared memory are better marked
as such in the name. That is, maybe it would be better if LVIndStats
were LVSharedIndStats and LVShared were LVSharedRelStats.

Hmm, LVShared actually also stores various things that are not
relevant to the relation. I'm not sure it's a good idea to rename
it to LVSharedRelStats. When we support parallel vacuum for other
vacuum steps, adding a struct that stores only relation statistics
might work well.

It might be better to move LVIndStats out of LVShared,
but I'm not confident.

+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel
...
+       lazy_begin_parallel_index_vacuum(lps, vacrelstats, for_cleanup);
...
+       do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+                                  lps->lvshared, vacrelstats->dead_tuples);
...
+       lazy_end_parallel_index_vacuum(lps, !for_cleanup);

The function takes the parameter for_cleanup, but the flag is
used by the three subfunctions in an utterly non-uniform way. It
seems useless to me to store for_cleanup in lvshared

I think that we need to store for_cleanup, or something telling
vacuum workers whether to do index vacuuming or index cleanup, in
lvshared. Or can we use something else instead?

and lazy_end is
rather confusing.

Ah, I used "lazy" as the prefix for functions in vacuumlazy.c. Fixed.

There's no explanation why "reinitialization"
== "!for_cleanup". In the first place,
lazy_begin_parallel_index_vacuum and
lazy_end_parallel_index_vacuum are called only from this function
and are rather short, so it doesn't seem reasonable for them to be
independent functions.

Okay, agreed. Fixed.

Since the previous version of the patch conflicts with current HEAD,
I've attached updated version patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v25-0001-Add-parallel-option-to-VACUUM-command.patchapplication/octet-stream; name=v25-0001-Add-parallel-option-to-VACUUM-command.patchDownload
From 7c90417a130c92f8930853a551046c2affc96c6b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 7 Jun 2019 15:25:45 +0900
Subject: [PATCH v25 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables us
to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and a parallel degree larger than the number of
indexes that the table has cannot be specified.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  33 ++
 src/backend/access/heap/vacuumlazy.c  | 888 ++++++++++++++++++++++++++++++----
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 ++
 src/backend/postmaster/autovacuum.c   |   2 +
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +
 src/test/regress/sql/vacuum.sql       |   7 +
 10 files changed, 887 insertions(+), 106 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a3..0cb32a1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2251,13 +2251,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command>, only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..c3347f2 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -224,6 +224,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please refer
+      to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Only one worker
+      can be used per index, so parallel workers are launched only when
+      there are at least <literal>2</literal> indexes in the table. Workers
+      for vacuum are launched before starting each phase and exit at the end
+      of the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +259,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1d..e8f4199 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Each individual index is processed by one vacuum
+ * process. At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples. When
+ * starting either index vacuuming or index cleanup, we launch parallel worker
+ * processes. Once all indexes are processed the parallel worker processes
+ * exit, and the leader process re-initializes the DSM segment while keeping
+ * the recorded dead tuples. Note that the parallel workers live only during
+ * one round of index vacuuming or index cleanup, but the leader process
+ * neither exits from parallel mode nor destroys the parallel context in
+ * between. Since updates of the index statistics are not allowed during
+ * parallel mode, we update them after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +55,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +71,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +127,94 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * we don't need to worry about DSM keys conflicting with plan_node_id
+ * here, so we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are in
+ * parallel mode and have prepared the DSM segments.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum
+ * mode, otherwise in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel workers. This is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples count for the index vacuuming case, or to the new
+	 * live tuples count for the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVSharedIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +233,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +255,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +268,33 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +608,17 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		create the parallel context and the DSM segment before starting heap
+ *		scan. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all indexes.
+ *		At the end of this function we exit from parallel mode. Index
+ *		bulk-deletion results are stored in the DSM segment, and we update
+ *		index statistics as a whole after exiting parallel mode, since no
+ *		writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +627,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +651,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +687,45 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum; allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +903,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +932,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +953,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1149,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1188,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1334,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1404,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1433,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1548,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1582,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1598,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,11 +1625,20 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/*
+	 * Do post-vacuum cleanup, and statistics update for each index if
+	 * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+	 * only post-vacuum cleanup and update statistics at the end of parallel
+	 * lazy vacuum.
+	 */
 	if (vacrelstats->useindex)
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	if (ParallelVacuumIsActive(lps))
 	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+		/* End parallel mode and update index statistics */
+		end_parallel_vacuum(lps, Irel, nindexes);
 	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
@@ -1534,7 +1705,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1714,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1762,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1773,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1903,284 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	StringInfoData buf;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/* Create the log message to report */
+	initStringInfo(&buf);
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all indexes
+		 * alone. Since there is hope that we can launch parallel workers during
+		 * the next index vacuuming cycle, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * Do index vacuuming or index cleanup with parallel workers, or by
+	 * the leader process alone in case no workers could be launched.
+	 */
+	do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+										  lps->lvshared,
+										  vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * If we are doing index cleanup, we don't need to reinitialize the
+	 * parallel context as no more index vacuuming or index cleanup will
+	 * be performed after it.
+	 */
+	if (!for_cleanup)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Parallel index vacuuming and index cleanup routine used by both the leader
+ * process and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup since that is not allowed during
+ * parallel mode; instead we copy index bulk-deletion results from local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+							  lvshared->reltuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete and
+		 * amvacuumcleanup to the DSM segment the first time we get it from
+		 * them, because they allocate it locally and the same index may be
+		 * vacuumed by a different vacuum process next time. Copying the result
+		 * normally happens only after the first index vacuuming. From the
+		 * second time on, we pass the result stored in the DSM segment so
+		 * that they update it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion results to
+		 * different slots, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated &&
+			stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or cleanup indexes. If parallel lazy vacuum is active this is done
+ * with parallel workers; otherwise the leader process handles all indexes
+ * alone. In the parallel vacuum case, this function must be used by the
+ * parallel vacuum leader process. for_cleanup is true if the caller requests
+ * index cleanup, otherwise index vacuuming.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							   int nindexes, IndexBulkDeleteResult **stats,
+							   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	if (nindexes <= 0)
+		return;
+
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* We are in single process vacuum, do index vacuuming or index cleanup */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+		else
+		{
+			/* Cleanup one index and update index statistics */
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2190,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2229,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	/* Update index statistics */
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2134,19 +2585,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2609,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2665,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2818,229 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker process to request. Both index
+ * vacuuming and index cleanup can be executed together with parallel workers
+ * if the table has more than one index. The relation sizes of table and
+ * indexes don't affect to the parallel degree for now. nrequested is the
+ * number of parallel workers that user requested and nindexes is the number
+ * of indexes that the table has.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers = 0;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+	{
+		/* At least one index is taken by the leader process */
+		parallel_workers = Min(nrequested, nindexes - 1);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes - 1;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		keys = 0;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	IndexBulkDeleteResult *copied_stats = NULL;
+	int *updated_idx = palloc(sizeof(int) * nindexes);
+	int nupdated = 0;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * No writes are allowed during parallel mode and it might not be
+	 * safe to exit from parallel mode while keeping the parallel context.
+	 * So we copy the updated index statistics to a temporary space and
+	 * update them after exiting parallel mode.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (lps->lvshared->indstats[i].updated)
+			updated_idx[nupdated++] = i;
+	}
+
+	copied_stats = palloc(sizeof(IndexBulkDeleteResult) * nupdated);
+
+	for (i = 0; i < nupdated; i++)
+		memcpy(&(copied_stats[i]),
+			   &(lps->lvshared->indstats[updated_idx[i]].stats),
+			   sizeof(IndexBulkDeleteResult));
+
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Update index statistics */
+	for (i = 0; i < nupdated; i++)
+		lazy_update_index_statistics(Irel[updated_idx[i]],
+									 &(copied_stats[i]));
+
+	pfree(copied_stats);
+	pfree(updated_idx);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers work only within index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted by OID, which should match
+	 * the leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a..86511b2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e7b379d..89dfc3b 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -99,6 +99,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +130,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +192,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fd85b9c..88e2fb0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* parallel lazy vacuum is not supported for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index b88bd8a..464d34d 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -195,6 +197,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae..43702f2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers
+	 * and 0 for choosing based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 21a167f..cbb0bbb 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,16 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index a558580..97a3140 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,13 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-- 
2.10.5

Attachment: v25-0002-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From f86e821ca2f23200f40e97222101c73e983cefa8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v25 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 +++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..f6ac0c6 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,22 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with <productname>PostgreSQL</productname>'s
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35..8fe8071 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index df2a315..fe3f06b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,6 +46,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -112,6 +114,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -141,6 +144,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -148,7 +152,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -214,6 +218,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -286,9 +308,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -894,6 +929,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1225,6 +1270,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.10.5

In reply to: Masahiko Sawada (#77)

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Hello

I reviewed v25 patches and have just a few notes.

Missing synopsis for the "PARALLEL" option (<synopsis> block in doc/src/sgml/ref/vacuum.sgml).
Missing prototype for vacuum_log_cleanup_info in "non-export function prototypes".

/*
 * Do post-vacuum cleanup, and statistics update for each index if
 * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
 * only post-vacuum cleanup and update statistics at the end of parallel
 * lazy vacuum.
 */
if (vacrelstats->useindex)
    lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
                                   indstats, lps, true);

if (ParallelVacuumIsActive(lps))
{
    /* End parallel mode and update index statistics */
    end_parallel_vacuum(lps, Irel, nindexes);
}

I personally do not like updating statistics in different places.
Can we change lazy_vacuum_or_cleanup_indexes to write stats for both parallel and non-parallel cases? I mean something like this:

if (ParallelVacuumIsActive(lps))
{
    /* Do parallel index vacuuming or index cleanup */
    lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
                                            nindexes, stats,
                                            lps, for_cleanup);
    if (for_cleanup)
    {
        ...
        for (i = 0; i < nindexes; i++)
            lazy_update_index_statistics(...);
    }
    return;
}

So all lazy_update_index_statistics calls would be in one place. lazy_parallel_vacuum_or_cleanup_indexes is called only from the parallel leader and waits for all workers. Possibly we could update stats in lazy_parallel_vacuum_or_cleanup_indexes after the WaitForParallelWorkersToFinish call.

Also, a discussion question: might the vacuumdb parameters --parallel= and --jobs= confuse users? Do we need more description for these options?
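For what it's worth, the two are orthogonal: --jobs=N opens N database connections and runs the per-table commands concurrently, whereas --parallel=N stays on a single connection and asks the server to process one table's indexes with N parallel workers (VACUUM (PARALLEL N)). A sentence contrasting the two on the vacuumdb page would probably help.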

regards, Sergei

#79Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#77)

On Fri, Jun 7, 2019 at 12:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Since the previous version patch conflicts with current HEAD, I've
attached the updated version patches.

Review comments:
------------------------------
*
      indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.

/which further/which is further

*
+ * index vacuuming or index cleanup, we launch parallel worker processes. Once
+ * all indexes are processed the parallel worker processes exit and the leader
+ * process re-initializes the DSM segment while keeping recorded dead tuples.

It is not clear from this comment why it re-initializes the DSM segment
instead of destroying it once the index work is done by the workers. Can
you elaborate a bit more in the comment?
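For reference, the leader-side flow in v25 that the comment is describing is roughly (condensed from the patch itself, not new code):

    LaunchParallelWorkers(lps->pcxt);
    do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
                                          lps->lvshared,
                                          vacrelstats->dead_tuples);
    WaitForParallelWorkersToFinish(lps->pcxt);
    if (!for_cleanup)
    {
        /* keep the DSM segment; the dead-tuple space and the per-index
         * stats slots must survive into the next index-vacuuming pass */
        pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
        ReinitializeParallelDSM(lps->pcxt);
    }

Spelling out in the comment that the segment is reused across index-vacuuming cycles, rather than rebuilt each time, would address this.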

*
+ * Note that all parallel workers live during one either index vacuuming or

It seems usage of 'one' is not required in the above sentence.

*
+
+/*
+ * Compute the number of parallel worker process to request.

/process/processes

*
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers = 0;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;

I think here, in the beginning, you can also check if
max_parallel_maintenance_workers is 0, then return.

*
In function compute_parallel_workers, don't we want to cap the number
of workers based on maintenance_work_mem as we do in
plan_create_index_workers?

The basic point is how we want to treat maintenance_work_mem for
this feature. Do we want all workers together to use at most
maintenance_work_mem, or is each worker allowed to use
maintenance_work_mem? I would prefer the former unless we have a good
reason to follow the latter strategy.

Accordingly, we might need to update the below paragraph in docs:
"Note that parallel utility commands should not consume substantially
more memory than equivalent non-parallel operations. This strategy
differs from that of parallel query, where resource limits generally
apply per worker process. Parallel utility commands treat the
resource limit <varname>maintenance_work_mem</varname> as a limit to
be applied to the entire utility command, regardless of the number of
parallel worker processes."
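For reference, the cap in plan_create_index_workers() is roughly the following (maintenance_work_mem is in KB there):

    /*
     * Each participant, leader included, receives an even share of the
     * maintenance_work_mem budget; aim to leave everyone at least 32MB.
     */
    while (parallel_workers > 0 &&
           maintenance_work_mem / (parallel_workers + 1) < 32768L)
        parallel_workers--;

Something along those lines could be applied in compute_parallel_workers() as well, if we settle on the whole-command budget.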

*
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int parallel_workers = 0;
+
+	Assert(nrequested >= 0);
+
+	if (nindexes <= 1)
+		return 0;
+
+	if (nrequested > 0)
+	{
+		/* At least one index is taken by the leader process */
+		parallel_workers = Min(nrequested, nindexes - 1);
+	}

I think here we always allow the leader to participate. It seems to
me we should have some way to disable leader participation. During the
development of previous parallel operations, we found it quite handy for
catching bugs. We might want to mimic what has been done for parallel
index builds with DISABLE_LEADER_PARTICIPATION.
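A sketch of what that could look like, borrowing the nbtsort.c convention (only the macro name exists today; the wiring below is hypothetical):

    /* Uncomment to disable the leader's participation, as a debugging aid */
    /* #define DISABLE_LEADER_PARTICIPATION */

        LaunchParallelWorkers(lps->pcxt);
    #ifndef DISABLE_LEADER_PARTICIPATION
        /* the leader processes indexes alongside the workers */
        do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
                                              lps->lvshared,
                                              vacrelstats->dead_tuples);
    #endif
        WaitForParallelWorkersToFinish(lps->pcxt);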

*
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED 1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES 2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT 3

I think it would be better if these keys were assigned numbers the
way we do for other similar operations like CREATE INDEX. See the
defines below in the code:
/* Magic numbers for parallel state sharing */
#define PARALLEL_KEY_BTREE_SHARED UINT64CONST(0xA000000000000001)

This will make the code consistent with other parallel operations.
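With that convention the vacuum keys would presumably become something like (the exact constants here are made up for illustration):

    /* Magic numbers for parallel state sharing */
    #define PARALLEL_VACUUM_KEY_SHARED        UINT64CONST(0xB000000000000001)
    #define PARALLEL_VACUUM_KEY_DEAD_TUPLES   UINT64CONST(0xB000000000000002)
    #define PARALLEL_VACUUM_KEY_QUERY_TEXT    UINT64CONST(0xB000000000000003)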

*
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
{
..
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
..
}

I think here you should use SizeOfLVDeadTuples as defined by the patch.

*
+	keys++;
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	keys++;
+
+	shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+	/* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);

The code style looks inconsistent here. In some cases, you are
calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
and in other cases, you are accumulating keys. I think it is better
to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
in all cases.
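That is, pair each chunk estimate with its key count immediately, e.g.:

    shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
    shm_toc_estimate_keys(&pcxt->estimator, 1);

    shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
    shm_toc_estimate_keys(&pcxt->estimator, 1);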

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
{
..
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
..
}

I think the last parameter in shm_toc_lookup should be false. Is
there a reason for passing it as true?
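That is, presumably just:

    sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);

so that a missing entry raises an error instead of silently returning NULL.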

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
..
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
..
}

I don't think it is a good idea to assume the lock mode is
ShareUpdateExclusiveLock here. If the lock level for the vacuum process
ever changes, we might forget to update it here. I think it is better
if we can get this information from the master backend.
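One way (the lockmode field below is hypothetical, not in the patch) would be to pass it through the shared area:

    /* leader, in begin_parallel_vacuum(): record the lock level it holds */
    shared->lockmode = ShareUpdateExclusiveLock;

    /* worker, in heap_parallel_vacuum_main(): reuse the leader's lock level */
    onerel = heap_open(lvshared->relid, lvshared->lockmode);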

*
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
{
..
+	/* Shutdown worker processes and destroy the parallel context */
+	WaitForParallelWorkersToFinish(lps->pcxt);
..
}

Do we really need to call WaitForParallelWorkersToFinish here, since it
must already have been called in lazy_parallel_vacuum_or_cleanup_indexes
by this point?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#80Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#79)

On Sat, Sep 21, 2019 at 6:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 7, 2019 at 12:03 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

Since the previous version patch conflicts with current HEAD, I've
attached the updated version patches.

Review comments:
------------------------------

Sawada-San, are you planning to work on the review comments? I can take
care of this and then proceed with further review if you are tied up with
something else.

*
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution
code,
+ * since we don't need to worry about DSM keys conflicting with
plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED 1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES 2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT 3

I think it would be better if these keys should be assigned numbers in
a way we do for other similar operation like create index. See below
defines
in code:
/* Magic numbers for parallel state sharing */
#define PARALLEL_KEY_BTREE_SHARED UINT64CONST(0xA000000000000001)

This will make the code consistent with other parallel operations.

I think we don't need to handle this comment. Today, I read the other
emails in the thread and noticed that you have done this based on a
comment by Robert, and that decision seems wise to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#81Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#80)

On Tue, Oct 1, 2019 at 10:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Sep 21, 2019 at 6:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 7, 2019 at 12:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Since the previous version patch conflicts with current HEAD, I've
attached the updated version patches.

Review comments:
------------------------------

Sawada-San, are you planning to work on the review comments? I can take care of this and then proceed with further review if you are tied up with something else.

Thank you for reviewing this patch.

Yes I'm addressing your comments and will submit the updated patch soon.

I think we don't need to handle this comment. Today, I read the other emails in the thread and noticed that you have done this based on comment by Robert and that decision seems wise to me.

Understood.

Regards,

--
Masahiko Sawada

#82Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#79)
2 attachment(s)

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 7, 2019 at 12:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Since the previous version patch conflicts with current HEAD, I've
attached the updated version patches.

Thank you for reviewing this patch!

Review comments:
------------------------------
*
indexes on the relation which further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.

/which further/which is further

Fixed.

*
+ * index vacuuming or index cleanup, we launch parallel worker processes. Once
+ * all indexes are processed the parallel worker processes exit and the leader
+ * process re-initializes the DSM segment while keeping recorded dead tuples.

It is not clear from this comment why it re-initializes the DSM segment
instead of destroying it once the index work is done by the workers. Can
you elaborate a bit more in the comment?

Added more explanation.
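
For context, the reuse pattern in the patch looks roughly like this (a
sketch; the leader runs it once per index-vacuuming pass):

    LaunchParallelWorkers(lps->pcxt);
    /* workers (and possibly the leader) process the indexes here */
    WaitForParallelWorkersToFinish(lps->pcxt);

    /* Keep the DSM segment -- and the dead tuple space in it -- but
     * reset the parallel context so workers can be launched again. */
    ReinitializeParallelDSM(lps->pcxt);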

*
+ * Note that all parallel workers live during one either index vacuuming or

It seems the word 'one' is not needed in the above sentence.

Removed.

*
+
+/*
+ * Compute the number of parallel worker process to request.

/process/processes

Fixed.

*
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+ int parallel_workers = 0;
+
+ Assert(nrequested >= 0);
+
+ if (nindexes <= 1)
+ return 0;

I think here, at the beginning, you can also check whether
max_parallel_maintenance_workers is 0 and return if so.

Agreed, fixed.

*
In function compute_parallel_workers, don't we want to cap the number
of workers based on maintenance_work_mem as we do in
plan_create_index_workers?

The basic point is how we want to treat maintenance_work_mem for
this feature. Do we want all workers combined to use at most
maintenance_work_mem, or is each worker allowed to use
maintenance_work_mem on its own? I would prefer the former unless we
have a good reason to follow the latter strategy.

Accordingly, we might need to update the below paragraph in docs:
"Note that parallel utility commands should not consume substantially
more memory than equivalent non-parallel operations. This strategy
differs from that of parallel query, where resource limits generally
apply per worker process. Parallel utility commands treat the
resource limit <varname>maintenance_work_mem</varname> as a limit to
be applied to the entire utility command, regardless of the number of
parallel worker processes."

I'd also prefer to use at most maintenance_work_mem during parallel
vacuum regardless of the number of parallel workers, which is the
current implementation. In lazy vacuum, maintenance_work_mem is used to
record the item pointers of dead tuples. This is done by the leader
process, and the worker processes just refer to them when vacuuming dead
index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming still takes time. So I thought we should cap the number
of parallel workers by the number of indexes rather than by
maintenance_work_mem.
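
(For a rough sense of scale, assuming the 6-byte ItemPointerData: with
maintenance_work_mem set to 64MB the leader can record about 11 million
dead tuple TIDs, and that single array is shared by all workers rather
than being allocated once per worker.)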

*
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+ int parallel_workers = 0;
+
+ Assert(nrequested >= 0);
+
+ if (nindexes <= 1)
+ return 0;
+
+ if (nrequested > 0)
+ {
+ /* At least one index is taken by the leader process */
+ parallel_workers = Min(nrequested, nindexes - 1);
+ }

I think here we always allow the leader to participate. It seems to me
we should have some way to disable leader participation. During the
development of previous parallel operations, we found it quite handy for
catching bugs. We might want to mimic what has been done for CREATE
INDEX with DISABLE_LEADER_PARTICIPATION.

Added the way to disable leader participation.
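
For reference, the pattern mirrors nbtsort.c; a sketch of what the
attached patch now does:

    bool    leaderparticipates = true;

    #ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
        leaderparticipates = false;
    #endif

    /* The leader joins in only if participation is enabled, or if no
     * workers could be launched at all. */
    if (leaderparticipates || lps->pcxt->nworkers_launched == 0)
        do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
                                              lps->lvshared,
                                              vacrelstats->dead_tuples);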

*
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED 1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES 2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT 3

I think it would be better if these keys were assigned numbers the way
we do for other similar operations like CREATE INDEX. See the defines
below in the code:
/* Magic numbers for parallel state sharing */
#define PARALLEL_KEY_BTREE_SHARED UINT64CONST(0xA000000000000001)

This will make the code consistent with other parallel operations.

I skipped this comment per your previous mail.

*
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+   int nindexes, int nrequested)
{
..
+ est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
..
}

I think here you should use SizeOfLVDeadTuples as defined by patch.

Fixed.

*
+ keys++;
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ keys++;
+
+ shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);
+ shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);

The code style looks inconsistent here. In some cases, you are
calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
and in other cases, you are accumulating keys. I think it is better
to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
in all cases.

Fixed. But there is some code that calls shm_toc_estimate_keys for
multiple keys at once, for example in nbtsort.c and parallel.c. What is
the difference?

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
{
..
+ /* Set debug_query_string for individual workers */
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
..
}

I think the last parameter in shm_toc_lookup should be false. Is
there a reason for passing it as true?

My bad, fixed.

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
..
+ /* Open table */
+ onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
..
}

I don't think it is a good idea to assume the lock mode is
ShareUpdateExclusiveLock here. If the lock level used by the vacuum
process ever changes, we might forget to update it here. I think it is
better if we can get this information from the master backend.

So did you mean to declare the lock mode for lazy vacuum somewhere as
a global variable and use it in both try_relation_open in the leader
process and relation_open in the worker process? Otherwise we would
end up adding something like shared->lmode = ShareUpdateExclusiveLock
during parallel context initialization, which does not seem to resolve
your concern.
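
For concreteness, the shared-state variant would look something like
this (a sketch only; the lmode field is illustrative):

    /* leader, while initializing the parallel context */
    shared->lmode = ShareUpdateExclusiveLock;

    /* worker, instead of hard-coding the lock level */
    onerel = heap_open(lvshared->relid, lvshared->lmode);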

*
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
{
..
+ /* Shutdown worker processes and destroy the parallel context */
+ WaitForParallelWorkersToFinish(lps->pcxt);
..
}

Do we really need to call WaitForParallelWorkersToFinish here as it
must have been called in lazy_parallel_vacuum_or_cleanup_indexes
before this time?

No, removed.

I've attached the updated version patch, which incorporates your
comments except for some that need more discussion. After that
discussion I'll update it again.

Regards,

--
Masahiko Sawada

Attachments:

v26-0001-Add-parallel-option-to-VACUUM-command.patch (text/x-patch)
From 80e717cd2a2a49b852f54f1e79cfd1cdd5cfa7d2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 2 Oct 2019 22:46:21 +0900
Subject: [PATCH v26 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than the
number of indexes on the table.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  34 +
 src/backend/access/heap/vacuumlazy.c  | 908 +++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 +
 src/backend/postmaster/autovacuum.c   |   2 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +
 src/test/regress/sql/vacuum.sql       |   7 +
 11 files changed, 909 insertions(+), 107 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 619ac8c50c..2be71dd128 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2264,13 +2264,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..339ac48033 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please refer
+      to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Only one worker
+      can be used per index, so parallel workers are launched only when
+      there are at least <literal>2</literal> indexes in the table. Workers
+      for vacuum are launched before the start of each phase and exit at the
+      end of the phase. This behavior might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +259,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..4ae9736e92 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Each individual index is processed by one vacuum
+ * process. At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples. When
+ * starting either index vacuuming or index cleanup, we launch parallel worker
+ * processes. Once all indexes are processed the parallel worker processes
+ * exit. The leader process then re-initializes the parallel context while
+ * keeping the recorded dead tuples so that it can launch parallel workers
+ * again the next time. Note that all parallel workers live only during index
+ * vacuuming or index cleanup, but the leader process neither exits parallel
+ * mode nor destroys the parallel context. Since no updates are allowed during
+ * parallel mode, we update the index statistics after exiting from parallel
+ * mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +56,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +72,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +128,101 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * in the parallel mode and prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum mode,
+ * otherwise in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel workers. So this is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to the
+	 * old live tuples count in the index vacuuming case or to the new live
+	 * tuples count in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVSharedIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +241,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +263,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +276,33 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +616,17 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		create the parallel context and the DSM segment before starting the
+ *		heap scan. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all indexes.
+ *		At the end of this function we exit from parallel mode. Index
+ *		bulk-deletion results are stored in the DSM segment, and we update the
+ *		index statistics as a whole after exiting parallel mode, since writes
+ *		are not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +635,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +659,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +695,45 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we will vacuum the indexes,
+	 * compute the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +911,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +940,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +961,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1157,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1196,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1342,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1412,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1441,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1556,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1590,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1606,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,11 +1633,20 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. In parallel lazy vacuum, do
+	 * only the post-vacuum cleanup here and update the statistics at the
+	 * end of the parallel lazy vacuum.
+	 */
 	if (vacrelstats->useindex)
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	if (ParallelVacuumIsActive(lps))
 	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+		/* End parallel mode and update index statistics */
+		end_parallel_vacuum(lps, Irel, nindexes);
 	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
@@ -1534,7 +1713,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1722,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1770,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1781,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1911,289 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	StringInfoData	buf;
+	bool			leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/* Create the log message to report */
+	initStringInfo(&buf);
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all indexes
+		 * alone. Since we may be able to launch parallel workers during the
+		 * next index vacuuming pass, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * Join index vacuuming or index cleanup with parallel workers. The
+	 * leader process does it alone in the case where no workers were launched.
+	 */
+	if (leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+											  lps->lvshared,
+											  vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * If we are doing index cleanup, we don't need to reinitialize the
+	 * parallel context as no more index vacuuming and index cleanup will
+	 * be performed after that.
+	 */
+	if (!for_cleanup)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Parallel index vacuuming and index cleanup routine used by both the leader
+ * process and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup, since that is not allowed during
+ * parallel mode; instead we copy the index bulk-deletion results from local
+ * memory to the DSM segment and update them at the end of the parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+							  lvshared->reltuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete or
+		 * amvacuumcleanup to the DSM segment the first time we get it from
+		 * them, because they allocate it locally and the index might be
+		 * vacuumed by a different vacuum process next time. Copying the
+		 * result normally happens only after the first index vacuuming; from
+		 * the second time on, we pass the result stored in the DSM segment
+		 * so that it is updated there directly.
+		 *
+		 * Since all vacuum workers write their bulk-deletion results to
+		 * different slots, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or clean up indexes. If parallel lazy vacuum is active, the work is
+ * performed with parallel workers; otherwise the leader process vacuums or
+ * cleans up the indexes by itself. In the parallel vacuum case, this function
+ * must be used by the parallel vacuum leader process. for_cleanup is true if
+ * the caller requests index cleanup, otherwise index vacuuming.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							   int nindexes, IndexBulkDeleteResult **stats,
+							   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* no job if the table has no index */
+	if (nindexes <= 0)
+		return;
+
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* We are in single process vacuum, do index vacuuming or index cleanup */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+		else
+		{
+			/* Cleanup one index and update index statistics */
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2203,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2242,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	/* Update index statistics */
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2134,19 +2598,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2622,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2678,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2831,236 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The sizes of the table and indexes don't affect the parallel
+ * degree for now. nrequested is the number of parallel workers that the user
+ * requested. If nrequested is 0, we compute the parallel degree based on
+ * nindexes, the number of indexes the table has.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int		parallel_workers;
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = nindexes;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	if (nrequested > 0)
+	{
+		/* The parallel degree is requested */
+		parallel_workers = Min(nrequested, nindexes_to_vacuum);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes_to_vacuum;
+	}
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shutdown workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	IndexBulkDeleteResult *copied_stats = NULL;
+	int *updated_idx = palloc(sizeof(int) * nindexes);
+	int nupdated = 0;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * Writes are not allowed during parallel mode, and it might not be
+	 * safe to exit from parallel mode while keeping the parallel context.
+	 * So we copy the updated index statistics to a temporary space and
+	 * apply them after exiting parallel mode.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (lps->lvshared->indstats[i].updated)
+			updated_idx[nupdated++] = i;
+	}
+
+	copied_stats = palloc(sizeof(IndexBulkDeleteResult) * nupdated);
+
+	for (i = 0; i < nupdated; i++)
+		memcpy(&(copied_stats[i]),
+			   &(lps->lvshared->indstats[updated_idx[i]].stats),
+			   sizeof(IndexBulkDeleteResult));
+
+	/* Shutdown worker processes and destroy the parallel context */
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Update index statistics */
+	for (i = 0; i < nupdated; i++)
+		lazy_update_index_statistics(Irel[updated_idx[i]],
+									 &(copied_stats[i]));
+
+	pfree(copied_stats);
+	pfree(updated_idx);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers are used only for index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted by OID, which should match
+	 * the order the leader sees.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either index vacuuming or index cleanup */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e154507ecd..78e2fe6c3f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -99,6 +99,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +130,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +192,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313337..da21d62635 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* parallel lazy vacuum is not supported for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default, meaning no
+	 * workers, and 0 means choose based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..7fa981c649 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,16 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..30f4c38ac8 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,13 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

Attachment: v26-0002-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch)
From 7085bef4caf59066ac9e09da240b03173b549d1c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v26 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables; 0 means choose based on
+									 * the number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

#83Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#82)

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I have started reviewing this patch and I have some cosmetic comments.
I will continue the review tomorrow.

+This change adds PARALLEL option to VACUUM command that enable us to
+perform index vacuuming and index cleanup with background
+workers. Indivisual

/s/Indivisual/Individual/

+ * parallel worker processes. Individual indexes is processed by one vacuum
+ * process. At beginning of lazy vacuum (at lazy_scan_heap) we prepare the

/s/Individual indexes is processed/Individual indexes are processed/
/s/At beginning/ At the beginning

+ * parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ * create the parallel context and the DSM segment before starting heap
+ * scan.

Can we extend the comment to explain why we do that before starting
the heap scan?

+ else
+ {
+ if (for_cleanup)
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned:
%d, requsted %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }
+ else
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }

In multiple places I see a lot of duplicated code for the for_cleanup
true and false cases. The only difference is whether the message says
"index cleanup" or "index vacuuming"; otherwise the code is identical
in both cases. Can't we build that string based on the value of
for_cleanup and append it to the message, so that we can avoid
duplicating this in so many places?
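
A rough sketch of the idea (untested; it reuses the variables from the
patch and ignores the could-not-launch case):

    const char *phase = for_cleanup ? "index cleanup" : "index vacuuming";

    if (lps->nworkers_requested > 0)
        appendStringInfo(&buf,
                         "launched %d parallel vacuum workers for %s (planned: %d, requested: %d)",
                         lps->pcxt->nworkers_launched, phase,
                         lps->pcxt->nworkers, lps->nworkers_requested);
    else
        appendStringInfo(&buf,
                         "launched %d parallel vacuum workers for %s (planned: %d)",
                         lps->pcxt->nworkers_launched, phase,
                         lps->pcxt->nworkers);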

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#84Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#83)
2 attachment(s)

On Thu, Oct 3, 2019 at 9:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I have started reviewing this patch and I have some cosmetic comments.
I will continue the review tomorrow.

Thank you for reviewing the patch!

+This change adds PARALLEL option to VACUUM command that enable us to
+perform index vacuuming and index cleanup with background
+workers. Indivisual

/s/Indivisual/Individual/

Fixed.

+ * parallel worker processes. Individual indexes is processed by one vacuum
+ * process. At beginning of lazy vacuum (at lazy_scan_heap) we prepare the

/s/Individual indexes is processed/Individual indexes are processed/
/s/At beginning/ At the beginning

Fixed.

+ * parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ * create the parallel context and the DSM segment before starting heap
+ * scan.

Can we extend the comment to explain why we do that before starting
the heap scan?

Added more comments.

+ else
+ {
+ if (for_cleanup)
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned:
%d, requsted %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }
+ else
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }

In multiple places I see a lot of duplicated code for the for_cleanup
true and false cases. The only difference is whether the message says
"index cleanup" or "index vacuuming"; otherwise the code is identical
in both cases. Can't we build that string based on the value of
for_cleanup and append it to the message, so that we can avoid
duplicating this in so many places?

I think the duplication is necessary for translation. IIUC, if we
construct the message from fragments it cannot be translated.
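
For illustration, a minimal sketch of the issue (nlaunched and nplanned
are hypothetical stand-ins for the lps->pcxt fields). With ngettext()
the complete format string is the msgid that translators see, and the
correct plural form is selected per language:

    appendStringInfo(&buf,
                     ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
                              "launched %d parallel vacuum workers for index cleanup (planned: %d)",
                              nlaunched),
                     nlaunched, nplanned);

With a composed message, only the fragment and the bare phase names
would end up in the message catalog, and the embedded "%s" cannot be
reordered or inflected per language:

    appendStringInfo(&buf,
                     "launched %d parallel vacuum workers for %s (planned: %d)",
                     nlaunched,
                     for_cleanup ? "index cleanup" : "index vacuuming",
                     nplanned);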

Attached the updated patch.

Regards,

--
Masahiko Sawada

Attachments:

v27-0002-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch)
From 9eb0b5e4e010783e04882cab4e4bab5063eabc56 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v27 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with <productname>PostgreSQL</productname>'s
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-workers-maintenance"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables; 0 means choose based on
+									 * the number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

v27-0001-Add-parallel-option-to-VACUUM-command.patch (text/x-patch)
From 7763fa4cbd60d3ff345359bc84bcfbd31839ae0f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 2 Oct 2019 22:46:21 +0900
Subject: [PATCH v27 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum process, so parallel
vacuum can be used when the table has at least two indexes, and the
specified parallel degree cannot be larger than the number of indexes
on the table.

The parallel degree is either specified by the user or determined based
on the number of indexes on the table, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  34 +
 src/backend/access/heap/vacuumlazy.c  | 909 +++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  27 +
 src/backend/postmaster/autovacuum.c   |   2 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  10 +
 src/test/regress/sql/vacuum.sql       |   7 +
 11 files changed, 910 insertions(+), 107 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 619ac8c50c..2be71dd128 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2264,13 +2264,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..339ac48033 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuuming and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please refer
+      to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Only one worker
+      can be used per index, so parallel workers are launched only when
+      there are at least <literal>2</literal> indexes in the table. Workers
+      for vacuum are launched before each phase starts and exit at the end
+      of the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +259,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..2200192b71 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes. Individual indexes are processed by one
+ * vacuum process. At the beginning of lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes. Once all indexes are processed, the parallel worker
+ * processes exit. Then the leader process re-initializes the parallel context
+ * while keeping the recorded dead tuples so that it can launch parallel
+ * workers again the next time. Note that all parallel workers live only
+ * during index vacuuming or index cleanup, but in between the leader process
+ * neither exits from parallel mode nor destroys the parallel context. Since
+ * writes are not allowed during parallel mode, we update the index
+ * statistics after exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +56,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +72,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +128,101 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check whether we are in a parallel lazy vacuum. If true, we
+ * are in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum mode;
+ * otherwise it is allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel vacuum workers. This is allocated
+ * in the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells the vacuum workers whether to do index vacuuming or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set it to
+	 * the old live tuples count in the index vacuuming case, or to the
+	 * new live tuples count in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVSharedIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+	LVShared		*lvshared;
+	int				nworkers_requested;	/* user-requested parallel degree */
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +241,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +263,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +276,33 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes);
+static void lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										   int nindexes,
+										   IndexBulkDeleteResult **stats,
+										   LVParallelState *lps, bool for_cleanup);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
+													Relation *Irel,
+													int nindexes,
+													IndexBulkDeleteResult **stats,
+													LVParallelState *lps,
+													bool for_cleanup);
+static void do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+												  IndexBulkDeleteResult **stats,
+												  LVShared *lvshared,
+												  LVDeadTuples *dead_tuples);
+static int compute_parallel_workers(Relation onerel, int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +616,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before
+ *		starting the heap scan so that we can record dead tuples in the DSM
+ *		segment. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all indexes.
+ *		At the end of this function we exit from parallel mode. Index
+ *		bulk-deletion results are stored in the DSM segment, and we update
+ *		the index statistics all together after exiting parallel mode, since
+ *		writes are not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +636,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +660,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +696,45 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(onerel,
+													params->nworkers,
+													nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +912,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +941,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +962,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1158,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1197,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1343,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1413,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1442,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1557,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1591,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1607,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+										   indstats, lps, false);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,11 +1634,20 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/*
+	 * Do post-vacuum cleanup and statistics update for each index if
+	 * we're not in parallel lazy vacuum. In parallel lazy vacuum, do
+	 * only post-vacuum cleanup here and update the statistics at the
+	 * end of the parallel lazy vacuum.
+	 */
 	if (vacrelstats->useindex)
+		lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+									   indstats, lps, true);
+
+	if (ParallelVacuumIsActive(lps))
 	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+		/* End parallel mode and update index statistics */
+		end_parallel_vacuum(lps, Irel, nindexes);
 	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
@@ -1534,7 +1714,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1723,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1771,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1782,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1912,289 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum or clean up indexes with parallel workers. This function must be
+ * called only by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps, bool for_cleanup)
+{
+	StringInfoData	buf;
+	bool			leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Set shared information to tell parallel workers */
+	lps->lvshared->for_cleanup = for_cleanup;
+	if (!for_cleanup)
+	{
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+	}
+	else
+	{
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	}
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	/* Create the log message to report */
+	initStringInfo(&buf);
+	if (lps->pcxt->nworkers_launched == 0)
+	{
+		/*
+		 * If no workers were launched, the leader process vacuums all
+		 * indexes alone. Since we may be able to launch parallel workers
+		 * the next time index vacuuming runs, we don't end parallel mode yet.
+		 */
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index cleanup (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d, requested: %d)"),
+								 lps->pcxt->nworkers, lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 gettext_noop("could not launch parallel vacuum worker for index vacuuming (planned: %d)"),
+								 lps->pcxt->nworkers);
+		}
+	}
+	else
+	{
+		if (for_cleanup)
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d, requested %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+										  "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+		else
+		{
+			if (lps->nworkers_requested > 0)
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d, requested %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d, requested %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers,
+								 lps->nworkers_requested);
+			else
+				appendStringInfo(&buf,
+								 ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+										  "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+										  lps->pcxt->nworkers_launched),
+								 lps->pcxt->nworkers_launched,
+								 lps->pcxt->nworkers);
+		}
+	}
+
+	ereport(elevel, (errmsg("%s", buf.data)));
+
+	/*
+	 * Join index vacuuming or index cleanup with the parallel workers. The
+	 * leader process does it alone in case no workers were launched.
+	 */
+	if (leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
+											  lps->lvshared,
+											  vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * If we are doing index cleanup, we don't need to reinitialize the
+	 * parallel context, as no more index vacuuming or index cleanup will
+	 * be performed after that.
+	 */
+	if (!for_cleanup)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Parallel index vacuuming and index cleanup routine used by both the leader
+ * process and worker processes. Unlike single-process vacuum, we don't update
+ * index statistics after index cleanup since that is not allowed during
+ * parallel mode; instead we copy index bulk-deletion results from local
+ * memory to the DSM segment and update them at the end of the parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+									  IndexBulkDeleteResult **stats,
+									  LVShared *lvshared,
+									  LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (!lvshared->for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+							  lvshared->reltuples);
+		else
+			lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+							   lvshared->estimated_count);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete or
+		 * amvacuumcleanup to the DSM segment the first time we get it from
+		 * them, because they allocate it locally and the same index might be
+		 * vacuumed by a different vacuum process the next time around.
+		 * Copying the result normally happens only after the first index
+		 * vacuuming; from the second time onward, we pass the result stored
+		 * in the DSM segment so that it is updated directly.
+		 *
+		 * Since all vacuum workers write bulk-deletion results into
+		 * different slots, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Vacuum or cleanup indexes. If parallel lazy vacuum is active, this is
+ * performed with parallel workers; otherwise the calling process vacuums
+ * or cleans up the indexes by itself. In the parallel vacuum case, this
+ * function must be used by the parallel vacuum leader process. for_cleanup
+ * is true if the caller requests index cleanup, otherwise index vacuuming.
+ */
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							   int nindexes, IndexBulkDeleteResult **stats,
+							   LVParallelState *lps, bool for_cleanup)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+
+	/* There is no job if the table has no indexes */
+	if (nindexes <= 0)
+		return;
+
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Do parallel index vacuuming or index cleanup */
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
+												nindexes, stats,
+												lps, for_cleanup);
+		return;
+	}
+
+	/* We are in single-process vacuum; do index vacuuming or index cleanup */
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		if (!for_cleanup)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+		else
+		{
+			/* Cleanup one index and update index statistics */
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+			lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+			if (stats[idx])
+				pfree(stats[idx]);
+		}
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2204,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2243,55 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+lazy_update_index_statistics(Relation indrel, IndexBulkDeleteResult *stats)
+{
+	Assert(!IsInParallelMode());
+
+	if (!stats || stats->estimated_count)
+		return;
 
-	pfree(stats);
+	/* Update index statistics */
+	vac_update_relstats(indrel,
+						stats->num_pages,
+						stats->num_index_tuples,
+						0,
+						false,
+						InvalidTransactionId,
+						InvalidMultiXactId,
+						false);
 }
 
 /*
@@ -2134,19 +2599,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2623,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2679,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2832,236 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The sizes of the table and indexes don't affect the parallel degree for
+ * now. nrequested is the number of parallel workers that the user requested.
+ * If nrequested is 0, we compute the parallel degree based on nindexes, the
+ * number of indexes the table has.
+ */
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+	int		parallel_workers;
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = nindexes;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	if (nrequested > 0)
+	{
+		/* The parallel degree is requested */
+		parallel_workers = Min(nrequested, nindexes_to_vacuum);
+	}
+	else
+	{
+		/*
+		 * The parallel degree is not requested. Compute it based on the
+		 * number of indexes.
+		 */
+		parallel_workers = nindexes_to_vacuum;
+	}
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Shut down workers, destroy the parallel context, and end parallel mode.
+ * Update index statistics after exiting parallel mode.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
+{
+	IndexBulkDeleteResult *copied_stats = NULL;
+	int *updated_idx = palloc(sizeof(int) * nindexes);
+	int nupdated = 0;
+	int i;
+
+	Assert(!IsParallelWorker());
+	Assert(Irel != NULL && nindexes > 0);
+
+	/*
+	 * Writes are not allowed during parallel mode, and it might not be
+	 * safe to exit from parallel mode while keeping the parallel context.
+	 * So we copy the updated index statistics to a temporary space and
+	 * update them after exiting parallel mode.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (lps->lvshared->indstats[i].updated)
+			updated_idx[nupdated++] = i;
+	}
+
+	copied_stats = palloc(sizeof(IndexBulkDeleteResult) * nupdated);
+
+	for (i = 0; i < nupdated; i++)
+		memcpy(&(copied_stats[i]),
+			   &(lps->lvshared->indstats[updated_idx[i]].stats),
+			   sizeof(IndexBulkDeleteResult));
+
+	/* Shutdown worker processes and destroy the parallel context */
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Update index statistics */
+	for (i = 0; i < nupdated; i++)
+		lazy_update_index_statistics(Irel[updated_idx[i]],
+									 &(copied_stats[i]));
+
+	pfree(copied_stats);
+	pfree(updated_idx);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/* Open table */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels is sorted by OID, which should match
+	 * the ordering the leader uses.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	do_parallel_vacuum_or_cleanup_indexes(indrels, nindexes, stats,
+										  lvshared, dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e154507ecd..78e2fe6c3f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -99,6 +99,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +130,27 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree. The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be at least 1"),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +192,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313337..da21d62635 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* parallel lazy vacuum is not supported for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default, meaning parallel
+	 * vacuum is disabled; 0 means the degree is chosen based on the number
+	 * of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..7fa981c649 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,16 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be at least 1
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..30f4c38ac8 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,13 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

#85Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#82)

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

*
In function compute_parallel_workers, don't we want to cap the number
of workers based on maintenance_work_mem as we do in
plan_create_index_workers?

The basic point is how do we want to treat maintenance_work_mem for
this feature. Do we want all workers to use at max the
maintenance_work_mem or each worker is allowed to use
maintenance_work_mem? I would prefer the former unless we have a good
reason to follow the latter strategy.

Accordingly, we might need to update the below paragraph in docs:
"Note that parallel utility commands should not consume substantially
more memory than equivalent non-parallel operations. This strategy
differs from that of parallel query, where resource limits generally
apply per worker process. Parallel utility commands treat the
resource limit <varname>maintenance_work_mem</varname> as a limit to
be applied to the entire utility command, regardless of the number of
parallel worker processes."

I'd also prefer to use maintenance_work_mem at max during parallel
vacuum regardless of the number of parallel workers. This is the
current implementation. In lazy vacuum, maintenance_work_mem is used
to record the item pointers of dead tuples. This is done by the leader
process, and worker processes just refer to them when vacuuming dead
index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming still takes time. So I thought we should cap the
number of parallel workers by the number of indexes rather than by
maintenance_work_mem.

Isn't that true only if we never use maintenance_work_mem during index
cleanup? However, I think we are using it during index cleanup, see for
example ginInsertCleanup. I think before reaching any conclusion about
what to do about this, first we need to establish whether this is a
problem. If I am correct, then only some of the index cleanups (like gin
index) use maintenance_work_mem, so we need to consider that point while
designing a solution for this.

*
+ keys++;
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ keys++;
+
+ shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);
+ shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);

The code style looks inconsistent here. In some cases, you are
calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
and in other cases, you are accumulating keys. I think it is better
to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
in all cases.

Fixed. But there is some code that calls shm_toc_estimate_keys for
multiple keys, for example in nbtsort.c and parallel.c. What is the
difference?

We can do it either way, depending on the situation. For example, in
nbtsort.c, there is an if check based on which the number of keys can vary.
I think here we should write it in a way that does not confuse the reader
about why it is done in a particular way. This is the reason I asked you
to be consistent.
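
To illustrate, both of these are valid uses of the API (a sketch; the sizes
and the condition are placeholders):

    /* Style 1: estimate the key immediately after each chunk. */
    shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
    shm_toc_estimate_keys(&pcxt->estimator, 1);

    /*
     * Style 2: accumulate keys and estimate them once at the end, which
     * helps when the number of keys depends on a branch.
     */
    keys = 0;
    shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
    keys++;
    if (some_condition)
    {
        shm_toc_estimate_chunk(&pcxt->estimator, est_other);
        keys++;
    }
    shm_toc_estimate_keys(&pcxt->estimator, keys);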

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
..
+ /* Open table */
+ onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
..
}

I don't think it is a good idea to assume the lock mode as
ShareUpdateExclusiveLock here. Tomorrow, if due to some reason there
is a change in lock level for the vacuum process, we might forget to
update it here. I think it is better if we can get this information
from the master backend.

So did you mean to declare the lock mode for lazy vacuum somewhere as
a global variable and use it in both try_relation_open in the leader
process and relation_open in the worker process? Otherwise we would
end up adding something like shared->lmode = ShareUpdateExclusiveLock
during parallel context initialization, which does not seem to resolve
your concern.

I was thinking that we could find a way to pass the lockmode we used in
vacuum_rel, but I guess we would need to pass it through multiple
functions, which will be a bit inconvenient. OTOH, today I checked
nbtsort.c (_bt_parallel_build_main) and found that there too we use it
directly instead of passing it from the master backend. I think we can
leave it as you have it in the patch, but add a comment on why it is okay
to use that lock mode?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#86Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#84)

On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Thu, Oct 3, 2019 at 9:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com>

wrote:

+ else
+ {
+ if (for_cleanup)
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup

(planned: %d)",

+ "launched %d parallel vacuum workers for index cleanup (planned:

%d)",

+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }
+ else
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned:

%d)",

+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }

In multiple places I see a lot of duplicate code for for_cleanup being
true or false. The only difference is in the message, whether we say index
cleanup or index vacuuming; otherwise the code is the same for both cases.
Can't we create some string based on the value of for_cleanup and append
it to the message? That way we can avoid duplicating this in many places.

I think it's necessary for translation. IIUC if we construct the
message dynamically it cannot be translated.
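
For example, message catalogs are extracted from string literals at build
time, so each complete sentence must appear literally in the source (a
sketch with made-up strings; buf, nlaunched, and for_cleanup stand in for
the real variables):

    /* Translatable: each complete sentence is one msgid. */
    appendStringInfo(&buf,
                     ngettext("launched %d parallel vacuum worker",
                              "launched %d parallel vacuum workers",
                              nlaunched),
                     nlaunched);

    /* Not translatable as one sentence: assembled at runtime. */
    appendStringInfo(&buf, "launched %d parallel vacuum workers for %s",
                     nlaunched,
                     for_cleanup ? "index cleanup" : "index vacuuming");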

Do we really need to log all those messages? The other places where we
launch parallel workers don't seem to use such messages. Why do you think
it is important to log the messages here when other cases don't?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#87Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#86)

On Fri, Oct 4, 2019 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Some more comments..
1.
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ if (!for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ else
+ {
+ /* Cleanup one index and update index statistics */
+ lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+    vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

I think instead of checking the for_cleanup variable for every index of
the loop, we had better move the loop inside, as shown below:

if (!for_cleanup)
    for (idx = 0; idx < nindexes; idx++)
        lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
                          vacrelstats->old_live_tuples);
else
    for (idx = 0; idx < nindexes; idx++)
    {
        lazy_cleanup_index(...);
        lazy_update_index_statistics(...);
        ...
    }

2.
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+    int nindexes, IndexBulkDeleteResult **stats,
+    LVParallelState *lps, bool for_cleanup)
+{
+ int idx;
+
+ Assert(!IsParallelWorker());
+
+ /* no job if the table has no index */
+ if (nindexes <= 0)
+ return;

Wouldn't it be a good idea to call this function only if nindexes > 0?
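i.e., guard it at the call sites instead, something like:

    if (nindexes > 0)
        lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
                                       stats, lps, for_cleanup);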

3.
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps, bool for_cleanup)

If you look at this function, there is not much common code between
the for_cleanup and non-for_cleanup cases except these 3-4 statements.
LaunchParallelWorkers(lps->pcxt);
/* Create the log message to report */
initStringInfo(&buf);
...
/* Wait for all vacuum workers to finish */
WaitForParallelWorkersToFinish(lps->pcxt);

Other than that you have got a lot of checks like this
+ if (!for_cleanup)
+ {
+ }
+ else
+ {
}

I think the code would be much more readable if we had 2 functions, one for
vacuum (lazy_parallel_vacuum_indexes) and another for
cleanup (lazy_parallel_cleanup_indexes).
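
i.e., roughly this split (a sketch, with bodies omitted):

    static void
    lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
                                 int nindexes, IndexBulkDeleteResult **stats,
                                 LVParallelState *lps);

    static void
    lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
                                  int nindexes, IndexBulkDeleteResult **stats,
                                  LVParallelState *lps);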

4.
 * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Individual indexes are processed by one vacuum

Spacing after the "." is not uniform; the previous comment uses 2 spaces
and the newly added one uses 1 space.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#88Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#82)

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

*
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
{
..
+ /* Shutdown worker processes and destroy the parallel context */
+ WaitForParallelWorkersToFinish(lps->pcxt);
..
}

Do we really need to call WaitForParallelWorkersToFinish here as it
must have been called in lazy_parallel_vacuum_or_cleanup_indexes
before this time?

No, removed.

+ /* Shutdown worker processes and destroy the parallel context */
+ DestroyParallelContext(lps->pcxt);

But you forgot to update the comment.

Few more comments:
--------------------------------
1.
+/*
+ * Parallel Index vacuuming and index cleanup routine used by both the
leader
+ * process and worker processes. Unlike single process vacuum, we don't
update
+ * index statistics after cleanup index since it is not allowed during
+ * parallel mode, instead copy index bulk-deletion results from the local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+  IndexBulkDeleteResult **stats,
+  LVShared *lvshared,
+  LVDeadTuples *dead_tuples)
+{
+ /* Loop until all indexes are vacuumed */
+ for (;;)
+ {
+ int idx;
+
+ /* Get an index number to process */
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ /* Done for all indexes? */
+ if (idx >= nindexes)
+ break;
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (lvshared->indstats[idx].updated &&
+ stats[idx] == NULL)
+ stats[idx] = &(lvshared->indstats[idx].stats);
+
+ /* Do vacuum or cleanup one index */
+ if (!lvshared->for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+  lvshared->reltuples);
+ else
+ lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+   lvshared->estimated_count);

It seems we always run index cleanup via a parallel worker, which seems
overkill because index cleanup generally scans the index only when
bulkdelete was not performed. In some cases, like for hash indexes, it
doesn't do anything even if bulk delete is not called. OTOH, for brin
indexes, it does the main job during cleanup, but we might be able to
always allow index cleanup by a parallel worker for brin indexes if we
remove the allocation in brinbulkdelete, which I am not sure is of any use.

I think we shouldn't do cleanup via a parallel worker for an index on
which bulkdelete has already been performed.
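
Something like the following in the per-index loop, perhaps (a sketch; the
bulkdeleted flag is hypothetical and would need to be tracked in
LVSharedIndStats):

    /* Hypothetical: skip worker cleanup when ambulkdelete already ran. */
    if (lvshared->for_cleanup && lvshared->indstats[idx].bulkdeleted)
        continue;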

2.
- for (i = 0; i < nindexes; i++)
- lazy_vacuum_index(Irel[i],
-  &indstats[i],
-  vacrelstats);
+ lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+   indstats, lps, false);

Indentation is not proper. You might want to run pgindent.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#89vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#88)

On Fri, Oct 4, 2019 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

One comment:
We can check that parallel_workers is within range, i.e., not more than
MAX_PARALLEL_WORKER_LIMIT.
+ int parallel_workers = 0;
+
+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }
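
i.e., something like (a sketch extending the above):

    parallel_workers = atoi(optarg);
    if (parallel_workers <= 0 || parallel_workers > MAX_PARALLEL_WORKER_LIMIT)
    {
        pg_log_error("number of parallel workers must be between 1 and %d",
                     MAX_PARALLEL_WORKER_LIMIT);
        exit(1);
    }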

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#90Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#85)

On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

*
In function compute_parallel_workers, don't we want to cap the number
of workers based on maintenance_work_mem as we do in
plan_create_index_workers?

The basic point is how do we want to treat maintenance_work_mem for
this feature. Do we want all workers to use at max the
maintenance_work_mem or each worker is allowed to use
maintenance_work_mem? I would prefer the former unless we have a good
reason to follow the latter strategy.

Accordingly, we might need to update the below paragraph in docs:
"Note that parallel utility commands should not consume substantially
more memory than equivalent non-parallel operations. This strategy
differs from that of parallel query, where resource limits generally
apply per worker process. Parallel utility commands treat the
resource limit <varname>maintenance_work_mem</varname> as a limit to
be applied to the entire utility command, regardless of the number of
parallel worker processes."

I'd also prefer to use maintenance_work_mem at max during parallel
vacuum regardless of the number of parallel workers. This is the
current implementation. In lazy vacuum, maintenance_work_mem is used
to record the item pointers of dead tuples. This is done by the leader
process, and worker processes just refer to them when vacuuming dead
index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming still takes time. So I thought we should cap the
number of parallel workers by the number of indexes rather than by
maintenance_work_mem.

Isn't that true only if we never use maintenance_work_mem during index cleanup? However, I think we are using it during index cleanup, see for example ginInsertCleanup. I think before reaching any conclusion about what to do about this, first we need to establish whether this is a problem. If I am correct, then only some of the index cleanups (like gin index) use maintenance_work_mem, so we need to consider that point while designing a solution for this.

I got your point. Currently single-process lazy vacuum could consume up
to (maintenance_work_mem * 2) memory, because we do index cleanup while
holding the dead tuple space, as you mentioned. And ginInsertCleanup is
also called at the beginning of ginbulkdelete. In the current parallel
lazy vacuum, each parallel vacuum worker could consume additional memory
apart from the memory used by the heap scan, depending on the
implementation of the target index AM. Given the current single and
parallel vacuum implementations, it would be better to control the total
amount of memory rather than the number of parallel workers. So one
approach I came up with is to make all vacuum workers use
(maintenance_work_mem / # of participants) as their new
maintenance_work_mem. It might be too small in some cases, but it doesn't
consume more memory than single-process lazy vacuum as long as the index
AM doesn't consume more memory regardless of maintenance_work_mem. I
think it really depends on the implementation of the index AM.
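
In other words, something like this when launching the workers (a sketch;
the function name is made up, maintenance_work_mem is in kB, and the 64kB
floor is an arbitrary assumption so that a participant never gets zero):

    /*
     * Sketch: give each participant an equal share of maintenance_work_mem
     * so that the total stays within the user-configured limit.
     * nparticipants includes the leader if it participates.
     */
    static int
    parallel_vacuum_work_mem(int nparticipants)
    {
        return Max(maintenance_work_mem / Max(nparticipants, 1), 64);
    }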

*
+ keys++;
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ keys++;
+
+ shm_toc_estimate_keys(&pcxt->estimator, keys);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);
+ shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);

The code style looks inconsistent here. In some cases, you are
calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
and in other cases, you are accumulating keys. I think it is better
to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
in all cases.

Fixed. But there is some code that calls shm_toc_estimate_keys for
multiple keys, for example in nbtsort.c and parallel.c. What is the
difference?

We can do it either way, depending on the situation. For example, in nbtsort.c, there is an if check based on which the number of keys can vary. I think here we should write it in a way that does not confuse the reader about why it is done in a particular way. This is the reason I asked you to be consistent.

Understood. Thank you for the explanation!

*
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
..
+ /* Open table */
+ onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
..
}

I don't think it is a good idea to assume the lock mode as
ShareUpdateExclusiveLock here. Tomorrow, if due to some reason there
is a change in lock level for the vacuum process, we might forget to
update it here. I think it is better if we can get this information
from the master backend.

So did you mean to declare the lock mode for lazy vacuum somewhere as
a global variable and use it in both try_relation_open in the leader
process and relation_open in the worker process? Otherwise we would
end up adding something like shared->lmode = ShareUpdateExclusiveLock
during parallel context initialization, which does not seem to resolve
your concern.

I was thinking that we could find a way to pass the lockmode we used in vacuum_rel, but I guess we would need to pass it through multiple functions, which will be a bit inconvenient. OTOH, today I checked nbtsort.c (_bt_parallel_build_main) and found that there too we use it directly instead of passing it from the master backend. I think we can leave it as you have it in the patch, but add a comment on why it is okay to use that lock mode?

Yeah agreed.
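
Something like this above the heap_open call in heap_parallel_vacuum_main,
perhaps:

    /*
     * Open the table. We can use ShareUpdateExclusiveLock here directly
     * because that is the lock mode vacuum_rel uses in the leader;
     * _bt_parallel_build_main makes the same kind of assumption for
     * parallel index builds.
     */
    onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);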

Regards,

--
Masahiko Sawada

#91Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#86)

On Fri, Oct 4, 2019 at 2:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 3, 2019 at 9:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+ else
+ {
+ if (for_cleanup)
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+   "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }
+ else
+ {
+ if (lps->nworkers_requested > 0)
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d, requested %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned:
%d, requested %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers,
+ lps->nworkers_requested);
+ else
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker for index vacuuming
(planned: %d)",
+   "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+   lps->pcxt->nworkers_launched),
+ lps->pcxt->nworkers_launched,
+ lps->pcxt->nworkers);
+ }

In multiple places I see a lot of duplicate code for for_cleanup being
true or false. The only difference is in the message, whether we say index
cleanup or index vacuuming; otherwise the code is the same for both cases.
Can't we create some string based on the value of for_cleanup and append
it to the message? That way we can avoid duplicating this in many places.

I think it's necessary for translation. IIUC if we construct the
message dynamically it cannot be translated.

Do we really need to log all those messages? The other places where we launch parallel workers don't seem to use such messages. Why do you think it is important to log the messages here when other cases don't?

Well, I would rather think that parallel create index doesn't log
enough messages. A parallel maintenance operation is invoked manually
by the user. I can imagine that a DBA wants to cancel and retry the
operation later if enough workers are not launched. But there is no
convenient way to confirm how many parallel workers were planned and
actually launched; we need to look at ps output or pg_stat_activity.
That's why I think the log message would be helpful for users.

Regards,

--
Masahiko Sawada

#92Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#87)

On Fri, Oct 4, 2019 at 3:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 4, 2019 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Some more comments..
1.
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ if (!for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ else
+ {
+ /* Cleanup one index and update index statistics */
+ lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+    vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

I think instead of checking the for_cleanup variable for every index of
the loop, we had better move the loop inside, as shown below:

if (!for_cleanup)
    for (idx = 0; idx < nindexes; idx++)
        lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
                          vacrelstats->old_live_tuples);
else
    for (idx = 0; idx < nindexes; idx++)
    {
        lazy_cleanup_index(...);
        lazy_update_index_statistics(...);
        ...
    }

2.
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+    int nindexes, IndexBulkDeleteResult **stats,
+    LVParallelState *lps, bool for_cleanup)
+{
+ int idx;
+
+ Assert(!IsParallelWorker());
+
+ /* no job if the table has no index */
+ if (nindexes <= 0)
+ return;

Wouldn't it be a good idea to call this function only if nindexes > 0?

3.
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps, bool for_cleanup)

If you look at this function, there is not much common code between
the for_cleanup and non-for_cleanup cases except these 3-4 statements.
LaunchParallelWorkers(lps->pcxt);
/* Create the log message to report */
initStringInfo(&buf);
...
/* Wait for all vacuum workers to finish */
WaitForParallelWorkersToFinish(lps->pcxt);

Other than that you have got a lot of checks like this
+ if (!for_cleanup)
+ {
+ }
+ else
+ {
}

I think the code would be much more readable if we had 2 functions, one for
vacuum (lazy_parallel_vacuum_indexes) and another for
cleanup (lazy_parallel_cleanup_indexes).

4.
* of index scans performed.  So we don't use maintenance_work_mem memory for
* the TID array, just enough to hold as many heap tuples as fit on one page.
*
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Individual indexes are processed by one vacuum

Spacing after the "." is not uniform; the previous comment uses 2 spaces
and the newly added one uses 1 space.

Few more comments
----------------------------

1.
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+ int parallel_workers;
+ bool leaderparticipates = true;

It seems this function is not using the onerel parameter, so we can remove it.

2.
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);

for consistency with other comments change
VACUUM_KEY_QUERY_TEXT to PARALLEL_VACUUM_KEY_QUERY_TEXT

3.
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
  (!wraparound ? VACOPT_SKIP_LOCKED : 0);
  tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
  tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+ /* parallel lazy vacuum is not supported for autovacuum */
+ tab->at_params.nworkers = -1;

What is the reason for this? Can we explain it in the comment?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#93Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#91)

On Fri, Oct 4, 2019 at 7:57 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Fri, Oct 4, 2019 at 2:31 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

Do we really need to log all those messages? The other places where we
launch parallel workers don't seem to use such messages. Why do you think
it is important to log the messages here when other cases don't?

Well, I would rather think that parallel create index doesn't log
enough messages. A parallel maintenance operation is invoked manually
by the user. I can imagine that a DBA wants to cancel and retry the
operation later if enough workers are not launched. But there is no
convenient way to confirm how many parallel workers were planned and
actually launched; we need to look at ps output or pg_stat_activity.
That's why I think the log message would be helpful for users.

Hmm, what is the guarantee that at a later time the user will get the
required number of workers? I think if the user decides to vacuum, then
she would want it to start sooner. Also, to cancel the vacuum for this
reason, the user needs to monitor the logs, which doesn't seem to be an
easy thing considering this information will be logged at DEBUG2 level. I
think it is better to add to the docs that we don't guarantee that the
number of workers the user has asked for or expected to use for a parallel
vacuum will be available during execution. Even if there is a compelling
reason (which I don't see) to log this information, I think we shouldn't
use more than one message to log it (there is no need for separate
messages for cleanup and vacuuming).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#94Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#90)

On Fri, Oct 4, 2019 at 7:34 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

I'd also prefer to use maintenance_work_mem at max during parallel
vacuum regardless of the number of parallel workers. This is the
current implementation. In lazy vacuum, maintenance_work_mem is used
to record the item pointers of dead tuples. This is done by the leader
process, and worker processes just refer to them when vacuuming dead
index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming still takes time. So I thought we should cap the
number of parallel workers by the number of indexes rather than by
maintenance_work_mem.

Isn't that true only if we never use maintenance_work_mem during index
cleanup? However, I think we are using it during index cleanup, see for
example ginInsertCleanup. I think before reaching any conclusion about
what to do about this, first we need to establish whether this is a
problem. If I am correct, then only some of the index cleanups (like gin
index) use maintenance_work_mem, so we need to consider that point while
designing a solution for this.

I got your point. Currently single-process lazy vacuum could consume up
to (maintenance_work_mem * 2) memory, because we do index cleanup while
holding the dead tuple space, as you mentioned. And ginInsertCleanup is
also called at the beginning of ginbulkdelete. In the current parallel
lazy vacuum, each parallel vacuum worker could consume additional memory
apart from the memory used by the heap scan, depending on the
implementation of the target index AM. Given the current single and
parallel vacuum implementations, it would be better to control the total
amount of memory rather than the number of parallel workers. So one
approach I came up with is to make all vacuum workers use
(maintenance_work_mem / # of participants) as their new
maintenance_work_mem.

Yeah, we can do something like that, but I am not clear whether the current
memory usage for Gin indexes is correct. I have started a new thread,
let's discuss there.

[1]: /messages/by-id/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#95Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#94)

On Sun, Oct 6, 2019 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 7:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I'd also prefer to use maintenance_work_mem at max during parallel
vacuum regardless of the number of parallel workers. This is the
current implementation. In lazy vacuum, maintenance_work_mem is used
to record the item pointers of dead tuples. This is done by the leader
process, and worker processes just refer to them when vacuuming dead
index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming still takes time. So I thought we should cap the
number of parallel workers by the number of indexes rather than by
maintenance_work_mem.

Isn't that true only if we never use maintenance_work_mem during index cleanup? However, I think we are using it during index cleanup, see for example ginInsertCleanup. I think before reaching any conclusion about what to do about this, first we need to establish whether this is a problem. If I am correct, then only some of the index cleanups (like gin index) use maintenance_work_mem, so we need to consider that point while designing a solution for this.

I got your point. Currently single-process lazy vacuum could consume up
to (maintenance_work_mem * 2) memory, because we do index cleanup while
holding the dead tuple space, as you mentioned. And ginInsertCleanup is
also called at the beginning of ginbulkdelete. In the current parallel
lazy vacuum, each parallel vacuum worker could consume additional memory
apart from the memory used by the heap scan, depending on the
implementation of the target index AM. Given the current single and
parallel vacuum implementations, it would be better to control the total
amount of memory rather than the number of parallel workers. So one
approach I came up with is to make all vacuum workers use
(maintenance_work_mem / # of participants) as their new
maintenance_work_mem.

Yeah, we can do something like that, but I am not clear whether the current memory usage for Gin indexes is correct. I have started a new thread; let's discuss there.

Thank you for starting that discussion!

Regards,

--
Masahiko Sawada

#96Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#93)

On Sat, Oct 5, 2019 at 8:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 7:57 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 4, 2019 at 2:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Do we really need to log all those messages? The other places where we launch parallel workers don't seem to use such messages. Why do you think it is important to log the messages here when other cases don't?

Well, I would rather think that parallel create index doesn't log
enough messages. A parallel maintenance operation is invoked manually
by the user. I can imagine that a DBA wants to cancel and retry the
operation later if enough workers are not launched. But there is no
convenient way to confirm how many parallel workers were planned and
actually launched; we need to check the ps output or pg_stat_activity.
That's why I think a log message would be helpful for users.

Hmm, what is the guarantee that at a later time the user will get the required number of workers? I think if the user decides to vacuum, then she would want it to start sooner. Also, to cancel the vacuum for this reason, the user needs to monitor the logs, which doesn't seem to be an easy thing considering this information will be logged at DEBUG2 level. I think it is better to add to the docs that we don't guarantee that the number of workers the user has asked for, or expected to use, for a parallel vacuum will be available during execution. Even if there is a compelling reason (which I don't see) to log this information, I think we shouldn't use more than one message to log it (there is no need for separate messages for cleanup and vacuuming).

I think that there is a use case where the user wants to cancel a
long-running analytic query that is using parallel workers, in order
to use those parallel workers for parallel vacuum instead. That way
the lazy vacuum will complete sooner. Or the user might want to check
the vacuum log to see how many parallel workers the lazy vacuum used,
for diagnosis when the vacuum took a long time. This log information
appears when the VERBOSE option is specified. When executing the
VACUUM command it's quite common to specify the VERBOSE option to see
more details of the vacuum execution, and VACUUM VERBOSE already emits
very detailed information such as how many frozen pages were skipped
and the OldestXmin. So I think this information would not be too odd
there. Are you concerned that this information takes many lines of
code, or that it's not worth logging?

I agree with adding to the docs that we don't guarantee that the
number of workers the user requested will be available.

--
Regards,

--
Masahiko Sawada

#97Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#96)

On Mon, Oct 7, 2019 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Oct 5, 2019 at 8:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 7:57 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 4, 2019 at 2:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Do we really need to log all those messages? The other places where we
launch parallel workers don't seem to use such messages. Why do you think it
is important to log the messages here when other cases don't?

Well, I would rather think that parallel create index doesn't log
enough messages. A parallel maintenance operation is invoked manually
by the user. I can imagine that a DBA wants to cancel and retry the
operation later if enough workers are not launched. But there is no
convenient way to confirm how many parallel workers were planned and
actually launched; we need to check the ps output or pg_stat_activity.
That's why I think a log message would be helpful for users.

Hmm, what is the guarantee that at a later time the user will get the
required number of workers? I think if the user decides to vacuum, then she
would want it to start sooner. Also, to cancel the vacuum for this reason,
the user needs to monitor the logs, which doesn't seem to be an easy thing
considering this information will be logged at DEBUG2 level. I think it is
better to add to the docs that we don't guarantee that the number of
workers the user has asked for, or expected to use, for a parallel vacuum
will be available during execution. Even if there is a compelling reason
(which I don't see) to log this information, I think we shouldn't use more
than one message to log it (there is no need for separate messages for
cleanup and vacuuming).

I think that there is a use case where the user wants to cancel a
long-running analytic query that is using parallel workers, in order
to use those parallel workers for parallel vacuum instead. That way
the lazy vacuum will complete sooner. Or the user might want to check
the vacuum log to see how many parallel workers the lazy vacuum used,
for diagnosis when the vacuum took a long time. This log information
appears when the VERBOSE option is specified. When executing the
VACUUM command it's quite common to specify the VERBOSE option to see
more details of the vacuum execution, and VACUUM VERBOSE already emits
very detailed information such as how many frozen pages were skipped
and the OldestXmin. So I think this information would not be too odd
there. Are you concerned that this information takes many lines of
code, or that it's not worth logging?

To an extent both, but I see the point you are making. So, we should try
to minimize the number of lines used to log this message. If we can use
just one message to log this information, that would be ideal.
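For example (just a sketch of what I mean, not actual patch code, and
ignoring translatability of the embedded phase name), both call sites
could share a single message, with the phase selecting the wording:

/*
 * Hypothetical sketch: one log call shared by index vacuuming and
 * index cleanup, instead of a separate message for each phase.
 */
ereport(elevel,
		(errmsg(ngettext("launched %d parallel vacuum worker for %s (planned: %d)",
						 "launched %d parallel vacuum workers for %s (planned: %d)",
						 lps->pcxt->nworkers_launched),
				lps->pcxt->nworkers_launched,
				lps->lvshared->for_cleanup ? "index cleanup" : "index vacuuming",
				lps->pcxt->nworkers)));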

I agree with adding to the docs that we don't guarantee that the
number of workers the user requested will be available.

Okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#98Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#87)

On Fri, Oct 4, 2019 at 7:05 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 4, 2019 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Some more comments:

Thank you!

1.
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ if (!for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ else
+ {
+ /* Cleanup one index and update index statistics */
+ lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+    vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

I think instead of checking the for_cleanup variable for every index in
the loop, we'd better move the loop inside the branches, as shown below:

Fixed.

if (!for_cleanup)
for (idx = 0; idx < nindexes; idx++)
lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
else
for (idx = 0; idx < nindexes; idx++)
{
lazy_cleanup_index
lazy_update_index_statistics
...
}

2.
+static void
+lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+    int nindexes, IndexBulkDeleteResult **stats,
+    LVParallelState *lps, bool for_cleanup)
+{
+ int idx;
+
+ Assert(!IsParallelWorker());
+
+ /* no job if the table has no index */
+ if (nindexes <= 0)
+ return;

Wouldn't it be a good idea to call this function only if nindexes > 0?

I realized that the callers of this function should pass nindexes > 0,
because they attempt to do index vacuuming or index cleanup. So it
should be an assertion rather than an early return. Thoughts?
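In other words, something like this (a sketch, not the final patch):

-	/* no job if the table has no index */
-	if (nindexes <= 0)
-		return;
+	/* callers must come here only when there is an index to process */
+	Assert(nindexes > 0);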

3.
+/*
+ * Vacuum or cleanup indexes with parallel workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats,
Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps, bool for_cleanup)

If you look at this function, there is not much common code between the
for_cleanup and non-for_cleanup paths, except these 3-4 statements:
LaunchParallelWorkers(lps->pcxt);
/* Create the log message to report */
initStringInfo(&buf);
...
/* Wait for all vacuum workers to finish */
WaitForParallelWorkersToFinish(lps->pcxt);

Other than that, you have got a lot of checks like this:
+ if (!for_cleanup)
+ {
+ }
+ else
+ {
}

I think the code would be much more readable if we had two functions, one
for vacuum (lazy_parallel_vacuum_indexes) and another for cleanup
(lazy_parallel_cleanup_indexes).

Seems like a good idea. Fixed.

4.
* of index scans performed.  So we don't use maintenance_work_mem memory for
* the TID array, just enough to hold as many heap tuples as fit on one page.
*
+ * Lazy vacuum supports parallel execution with parallel worker processes. In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes. Individual indexes are processed by one vacuum

Spacing after the "." is not uniform; the previous comment uses 2 spaces
and the newly added one uses 1 space.

Fixed.

The code has been fixed in my local repository. After incorporating all
the comments I've got so far, I'll submit the updated version of the patch.

Regards,

--
Masahiko Sawada

#99Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#92)

On Sat, Oct 5, 2019 at 4:36 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

Few more comments
----------------------------

1.
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+ int parallel_workers;
+ bool leaderparticipates = true;

Seems like this function is not using the onerel parameter, so we can remove it.

Fixed.
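For reference, the signature in the attached v28 patch becomes
compute_parallel_workers(int nrequested, int nindexes). A rough sketch of
the intended computation (the body below is my illustration based on the
design notes in the commit message, not the patch's exact code):

static int
compute_parallel_workers(int nrequested, int nindexes)
{
	int			parallel_workers;

	/* parallelism needs at least two indexes */
	if (nindexes < 2)
		return 0;

	/*
	 * Use the requested degree if given; otherwise one worker per index.
	 * In either case, the degree cannot exceed the number of indexes.
	 */
	parallel_workers = (nrequested > 0) ? nrequested : nindexes;
	parallel_workers = Min(parallel_workers, nindexes);

	/* ... and it is further limited by max_parallel_maintenance_workers */
	return Min(parallel_workers, max_parallel_maintenance_workers);
}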

2.
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);

For consistency with the other comments, change
VACUUM_KEY_QUERY_TEXT to PARALLEL_VACUUM_KEY_QUERY_TEXT.

Fixed.

3.
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
(!wraparound ? VACOPT_SKIP_LOCKED : 0);
tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+ /* parallel lazy vacuum is not supported for autovacuum */
+ tab->at_params.nworkers = -1;

What is the reason for this? Can we explain it in the comments?

I think it's just that we don't want to support parallel autovacuum
because it can consume more CPU resources despite being a background
job, which might be unexpected behavior for autovacuum. I've changed
the comment.
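Something along these lines (a sketch, not necessarily the exact wording
of my local change):

+	/*
+	 * Parallel lazy vacuum is not supported for autovacuum for now,
+	 * since it could consume more CPU and I/O resources than users
+	 * expect from a background job.
+	 */
+	tab->at_params.nworkers = -1;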

Regards,

--
Masahiko Sawada

#100Masahiko Sawada
sawada.mshk@gmail.com
In reply to: vignesh C (#89)

On Fri, Oct 4, 2019 at 8:55 PM vignesh C <vignesh21@gmail.com> wrote:

On Fri, Oct 4, 2019 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

One comment:

Thank you for reviewing this patch.

We can check that parallel_workers is within range, i.e., within
MAX_PARALLEL_WORKER_LIMIT.
+ int parallel_workers = 0;
+
+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }

Fixed.
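The bounded check would look roughly like this (a sketch of the suggested
change; note that MAX_PARALLEL_WORKER_LIMIT is a backend constant, so the
vacuumdb client would need its own mirror of that limit):

case 'P':
	{
		int			parallel_workers = 0;

		if (optarg != NULL)
		{
			parallel_workers = atoi(optarg);
			if (parallel_workers <= 0 ||
				parallel_workers > MAX_PARALLEL_WORKER_LIMIT)
			{
				pg_log_error("number of parallel workers must be between 1 and %d",
							 MAX_PARALLEL_WORKER_LIMIT);
				exit(1);
			}
		}

		/* 0 means PARALLEL without an explicit degree */
		vacopts.parallel_workers = parallel_workers;
		break;
	}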

Regards,

--
Masahiko Sawada

#101Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#99)

On Wed, Oct 9, 2019 at 6:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Oct 5, 2019 at 4:36 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

3.
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
(!wraparound ? VACOPT_SKIP_LOCKED : 0);
tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+ /* parallel lazy vacuum is not supported for autovacuum */
+ tab->at_params.nworkers = -1;

What is the reason for this? Can we explain it in the comments?

I think it's just that we don't want to support parallel autovacuum
because it can consume more CPU resources despite being a background
job, which might be unexpected behavior for autovacuum.

I think the other reason is that it can generate a lot of I/O, which
might choke other operations. I think if we want we can provide GUC(s)
to control such behavior, but initially providing it via the command
should be a good start so that users can knowingly use it in
appropriate cases. We can later extend it to autovacuum if required.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#102Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#88)

On Fri, Oct 4, 2019 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Few more comments:
---------------------------------
1. Currently parallel vacuum is allowed for temporary relations,
which is wrong. It leads to the below error:

postgres=# create temporary table tmp_t1(c1 int, c2 char(10));
CREATE TABLE
postgres=# create index idx_tmp_t1 on tmp_t1(c1);
CREATE INDEX
postgres=# create index idx1_tmp_t1 on tmp_t1(c2);
CREATE INDEX
postgres=# insert into tmp_t1 values(generate_series(1,10000),'aaaa');
INSERT 0 10000
postgres=# delete from tmp_t1 where c1 > 5000;
DELETE 5000
postgres=# vacuum (parallel 2) tmp_t1;
ERROR: cannot access temporary tables during a parallel operation
CONTEXT: parallel worker

The parallel vacuum shouldn't be allowed for temporary relations.

2.
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [
<replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable
class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable
class="parameter">integer</replaceable> ]

Now, if the user gives a command like Vacuum (analyze, parallel)
<table_name>; it is not very obvious that the parallel option will be
used only for the vacuum part, not for analyze. I think we can add
a note in the docs to mention this explicitly, which can avoid any
confusion.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#103Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#88)

On Fri, Oct 4, 2019 at 7:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

*
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
{
..
+ /* Shutdown worker processes and destroy the parallel context */
+ WaitForParallelWorkersToFinish(lps->pcxt);
..
}

Do we really need to call WaitForParallelWorkersToFinish here, as it
must have been called in lazy_parallel_vacuum_or_cleanup_indexes
before this point?

No, removed.

+ /* Shutdown worker processes and destroy the parallel context */
+ DestroyParallelContext(lps->pcxt);

But you forgot to update the comment.

Fixed.

Few more comments:
--------------------------------
1.
+/*
+ * Parallel Index vacuuming and index cleanup routine used by both the leader
+ * process and worker processes. Unlike single process vacuum, we don't update
+ * index statistics after cleanup index since it is not allowed during
+ * parallel mode, instead copy index bulk-deletion results from the local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+  IndexBulkDeleteResult **stats,
+  LVShared *lvshared,
+  LVDeadTuples *dead_tuples)
+{
+ /* Loop until all indexes are vacuumed */
+ for (;;)
+ {
+ int idx;
+
+ /* Get an index number to process */
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ /* Done for all indexes? */
+ if (idx >= nindexes)
+ break;
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (lvshared->indstats[idx].updated &&
+ stats[idx] == NULL)
+ stats[idx] = &(lvshared->indstats[idx].stats);
+
+ /* Do vacuum or cleanup one index */
+ if (!lvshared->for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+  lvshared->reltuples);
+ else
+ lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+   lvshared->estimated_count);

It seems we always run index cleanup via a parallel worker, which seems overkill because index cleanup generally scans the index only when bulkdelete was not performed. In some cases, like for hash indexes, it doesn't do anything even if bulk delete is not called. OTOH, for brin indexes, it does the main job during cleanup, but we might be able to always allow index cleanup by a parallel worker for brin indexes if we remove the allocation in brinbulkdelete, which I am not sure is of any use.

I think we shouldn't call cleanup via a parallel worker unless bulkdelete hasn't been performed on the index.

Agreed. Fixed.

2.
- for (i = 0; i < nindexes; i++)
- lazy_vacuum_index(Irel[i],
-  &indstats[i],
-  vacrelstats);
+ lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+   indstats, lps, false);

Indentation is not proper. You might want to run pgindent.

Fixed.

Regards,

--
Masahiko Sawada

#104Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#102)
2 attachment(s)

On Thu, Oct 10, 2019 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2019 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Few more comments:

Thank you for reviewing the patch!

---------------------------------
1. Currently parallel vacuum is allowed for temporary relations,
which is wrong. It leads to the below error:

postgres=# create temporary table tmp_t1(c1 int, c2 char(10));
CREATE TABLE
postgres=# create index idx_tmp_t1 on tmp_t1(c1);
CREATE INDEX
postgres=# create index idx1_tmp_t1 on tmp_t1(c2);
CREATE INDEX
postgres=# insert into tmp_t1 values(generate_series(1,10000),'aaaa');
INSERT 0 10000
postgres=# delete from tmp_t1 where c1 > 5000;
DELETE 5000
postgres=# vacuum (parallel 2) tmp_t1;
ERROR: cannot access temporary tables during a parallel operation
CONTEXT: parallel worker

The parallel vacuum shouldn't be allowed for temporary relations.

Fixed.

2.
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [
<replaceable class="paramet
SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
INDEX_CLEANUP [ <replaceable
class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable
class="parameter">integer</replaceable> ]

Now, if the user gives a command like Vacuum (analyze, parallel)
<table_name>; it is not very obvious that the parallel option will be
used only for the vacuum part, not for analyze. I think we can add
a note in the docs to mention this explicitly, which can avoid any
confusion.

Agreed.

Attached is the latest version of the patch, although the memory usage
problem is still under discussion. I'll update the patches according to
the result of that discussion.

Regards,

--
Masahiko Sawada

Attachments:

v28-0002-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch)
From 941fd0848e95e6bc48b51e86f1456e6da800c77e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v28 2/2] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with <productname>PostgreSQL</productname>'s
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-workers-maintenance"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow to set 0, meaning PARALLEL without the parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

v28-0001-Add-parallel-option-to-VACUUM-command.patch (text/x-patch)
From 59ff7673e31e49fee7603562089e416337f26add Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 2 Oct 2019 22:46:21 +0900
Subject: [PATCH v28 1/2] Add parallel option to VACUUM command

This change adds PARALLEL option to VACUUM command that enable us to
perform index vacuuming and index cleanup with background
workers. Individual indexes is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes and it cannot specify larger parallel degree than
the number of indexes that the table has.

The parallel degree is either specified by user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  44 ++
 src/backend/access/heap/vacuumlazy.c  | 890 +++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  45 ++
 src/backend/postmaster/autovacuum.c   |   2 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  14 +
 src/test/regress/sql/vacuum.sql       |  10 +
 11 files changed, 924 insertions(+), 109 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 47b12c6a8f..9012e5549e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2265,13 +2265,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without <literal>FULL</literal>
+         option. Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..801daddb1f 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,31 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for the detail of each vacuum phases, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on number of indexes on the relation which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Please note
+      that it is not guaranteed that the number of parallel worker specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index. So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table. Workers for
+      vacuum launches before starting each phases and exit at the end of
+      the phase. These behaviors might change in a future release. This
+      option can not use with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +263,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +354,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used for only vacuum purpose.
+     Even if this option is specified with <option>ANALYZE</option> option
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..3c5e16608e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed the parallel worker processes exit.  And then the
+ * leader process re-initializes the parallel context while keeping recorded
+ * dead tuples so that the leader can launch parallel workers again in the next
+ * time.  Note that all parallel workers live during either index vacuuming or
+ * index cleanup but the leader process neither exits from the parallel mode
+ * nor destroys the parallel context.  For updating the index statistics, since
+ * any updates are not allowed during parallel mode we update the index
+ * statistics after exited from the parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +56,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +72,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +128,111 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * in the parallel mode and prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment when parallel lazy vacuum
+ * mode, otherwise allocated in a local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData)
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel workers. So this is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either an
+	 * old live tuples in index vacuuming case or the new live tuples in index
+	 * cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. An variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared offsetof(LVShared, indstats) + sizeof(LVSharedIndStats)
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/* User-requested parallel degree */
+	int				nworkers_requested;
+
+	/*
+	 * Always true except in a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +251,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +273,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +286,37 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										 int nindexes, IndexBulkDeleteResult **stats,
+										 LVParallelState *lps);
+static void lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										  int nindexes, IndexBulkDeleteResult **stats,
+										  LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static int compute_parallel_workers(int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +630,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment. All
+ *		parallel workers are launched at beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes. At the end of this
+ *		function we exit from parallel mode. Index bulk-deletion results are
+ *		stored in the DSM segment and update index statistics as a whole after
+ *		exited from parallel mode since all writes are not allowed during parallel
+ *		mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +650,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +674,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +710,43 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum worker to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(params->nworkers, nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +924,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +953,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +973,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1169,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1208,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1354,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1424,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1453,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1568,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1602,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1618,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1644,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot wirte
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	 update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1722,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1731,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1779,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1790,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1920,258 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuuming indexes with parallel vacuum workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							 int nindexes, IndexBulkDeleteResult **stats,
+							 LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index vacuuming */
+	lps->lvshared->for_cleanup = false;
+
+	/*
+	 * We can only provide an approximate value of num_heap_tuples in
+	 * vacuum cases.
+	 */
+	lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+	lps->lvshared->estimated_count = true;
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+							 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index vacuuming with parallel workers. The leader process alone
+	 * does that in case where no workers launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the parallel context to relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lps->pcxt);
+}
+
+/*
+ * Cleanup indexes with parallel vacuum workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+									   int nindexes, IndexBulkDeleteResult **stats,
+									   LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index cleanup */
+	lps->lvshared->for_cleanup = true;
+
+	/*
+	 * Now we can provide a better estimate of total number of surviving
+	 * tuples (we assume indexes are more interested in that than in the
+	 * number of nominally live tuples).
+	 */
+	lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+	lps->lvshared->estimated_count =
+		(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+							 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index cleanup with parallel workers. The leader process alone does
+	 * that in case where no workers launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+
+	/*
+	 * We don't need to reinitialize the parallel context unlike parallel index
+	 * vacuum as no more index vacuuming and index cleanup will be performed after
+	 * that.
+	 */
+}
+
+/*
+ * Index vacuum and index cleanup routine used by parallel vacuum worker processes
+ * including the leader process.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete and
+		 * amvacuumcleanup to the DSM segment the first time we get it from
+		 * them, because they allocate it locally and a different vacuum
+		 * process may vacuum the index next time. This copy normally
+		 * happens only after the first index vacuuming. From the second
+		 * time on, we pass the result on the DSM segment so that they
+		 * update it directly.
+		 *
+		 * Since all vacuum workers write their bulk-deletion results to
+		 * different slots, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+		 * We no longer need the locally allocated result; stats[idx] now
+		 * points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Clean up indexes. This function must be used by the parallel vacuum leader
+ * process in the parallel vacuum case.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	if (ParallelVacuumIsActive(lps))
+	{
+		/*
+		 * Generally index cleanup does not scan the index when index vacuuming
+		 * (ambulkdelete) was already performed. So we perform index cleanup
+		 * with parallel workers only if we have not performed index vacuuming
+		 * yet. Otherwise the leader process does it alone.
+		 */
+		if (vacrelstats->num_index_scans == 0)
+			lazy_parallel_cleanup_indexes(vacrelstats, Irel, nindexes, stats, lps);
+		else
+		{
+			lps->lvshared->for_cleanup = true;
+			vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
+											 lps->lvshared,
+											 vacrelstats->dead_tuples);
+		}
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes. This function must be used by the parallel vacuum leader
+ * process in the parallel vacuum case.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * Perform index vacuuming. If parallel vacuum is active we perform
+	 * index vacuuming with parallel workers. Otherwise do it alone.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		lazy_parallel_vacuum_indexes(vacrelstats, Irel, nindexes, stats, lps);
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2181,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples, and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2220,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
 
-	pfree(stats);
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2583,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2607,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2663,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2816,229 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The sizes of the table and indexes don't affect the parallel degree
+ * for now. nrequested is the number of parallel workers that the user
+ * requested. If nrequested is 0, we compute the parallel degree based on
+ * nindexes, the number of indexes the table has.
+ */
+static int
+compute_parallel_workers(int nrequested, int nindexes)
+{
+	int		parallel_workers;
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = nindexes;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * All writes are not allowed during parallel mode and it might not be
+ * safe to exit from the parallel mode while keeping the parallel context.
+ * So we copy the updated index statistics to a temporary space and adjust
+ * 'stats' so that we can update index statistics after exited from the
+ * parallel mode.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	IndexBulkDeleteResult *copied_stats;
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics and adjust each elements of stats */
+	copied_stats = palloc(sizeof(IndexBulkDeleteResult) * nindexes);
+	for (i = 0; i < nindexes; i++)
+	{
+		if (lps->lvshared->indstats[i].updated)
+		{
+			memcpy(&(copied_stats[i]),
+				   &(lps->lvshared->indstats[i].stats),
+				   sizeof(IndexBulkDeleteResult));
+			stats[i] = &(copied_stats[i]);
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table. The lock mode is the same as the leader process's. It's
+	 * okay because the lock mode does not conflict among the parallel workers.
+	 */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e154507ecd..23f20d93e5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -1736,6 +1765,22 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Check whether it's a temporary relation and the PARALLEL option is
+	 * specified. Since parallel workers cannot access data in temporary
+	 * tables, parallel vacuum is not allowed for temporary relations.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("skipping \"%s\" --- cannot parallel vacuum temporary tables",
+						RelationGetRelationName(onerel))));
+		relation_close(onerel, lmode);
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+		return false;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313337..de43d1e4f0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 (the default) means no
+	 * parallel workers; 0 means choose based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..f134412c3d 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,20 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+WARNING:  skipping "tmp" --- cannot parallel vacuum temporary tables
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..66a9b110fe 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

#105Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#104)

Hi

On Thu, 10 Oct 2019 at 13:18, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 10, 2019 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Fri, Oct 4, 2019 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com>

wrote:

Few more comments:

Thank you for reviewing the patch!

---------------------------------
1. Currently parallel vacuum is allowed for temporary relations,
which is wrong. It leads to the below error:

postgres=# create temporary table tmp_t1(c1 int, c2 char(10));
CREATE TABLE
postgres=# create index idx_tmp_t1 on tmp_t1(c1);
CREATE INDEX
postgres=# create index idx1_tmp_t1 on tmp_t1(c2);
CREATE INDEX
postgres=# insert into tmp_t1 values(generate_series(1,10000),'aaaa');
INSERT 0 10000
postgres=# delete from tmp_t1 where c1 > 5000;
DELETE 5000
postgres=# vacuum (parallel 2) tmp_t1;
ERROR: cannot access temporary tables during a parallel operation
CONTEXT: parallel worker

The parallel vacuum shouldn't be allowed for temporary relations.

Fixed.

2.
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [
<replaceable class="paramet
SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
INDEX_CLEANUP [ <replaceable
class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable
class="parameter">integer</replaceable> ]

Now, if the user gives a command like Vacuum (analyze, parallel)
<table_name>; it is not very obvious that a parallel option will be
only used for vacuum purposes but not for analyze. I think we can add
a note in the docs to mention this explicitly. This can avoid any
confusion.

Agreed.

Attached the latest version patch although the memory usage problem is
under discussion. I'll update the patches according to the result of
that discussion.

I applied both patches on HEAD and did some testing. I am getting a crash
while freeing memory (pfree(stats[i])).

Steps to reproduce:
Step 1) Apply both the patches and configure with the below command:
./configure --with-zlib --enable-debug --prefix=$PWD/inst/
--with-openssl CFLAGS="-ggdb3" > war && make -j 8 install > war

Step 2) Now start the server.

Step 3) Fire the below commands:

create table tmp_t1(c1 int, c2 char(10));
create index idx_tmp_t1 on tmp_t1(c1);
create index idx1_tmp_t1 on tmp_t1(c2);
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
insert into tmp_t1 values(generate_series(1,10000),'aaaa');
delete from tmp_t1 where c1 > 5000;
vacuum (parallel 2) tmp_t1;

Call stack:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: mahendra postgres [local] VACUUM
'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000a4f97a in pfree (pointer=0x10baa68) at mcxt.c:1060
1060 context->methods->free_p(context, pointer);
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64
libcom_err-1.42.9-12.el7_5.x86_64 libselinux-2.5-12.el7.x86_64
openssl-libs-1.0.2k-12.el7.x86_64 pcre-8.32-17.el7.x86_64
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x0000000000a4f97a in pfree (pointer=0x10baa68) at mcxt.c:1060
#1 0x00000000004e7d13 in update_index_statistics (Irel=0x10b9808,
stats=0x10b9828, nindexes=2) at vacuumlazy.c:2277
#2 0x00000000004e693f in lazy_scan_heap (onerel=0x7f8d99610d08,
params=0x7ffeeaddb7f0, vacrelstats=0x10b9728, Irel=0x10b9808, nindexes=2,
aggressive=false) at vacuumlazy.c:1659
'#3 0x00000000004e4d25 in heap_vacuum_rel (onerel=0x7f8d99610d08,
params=0x7ffeeaddb7f0, bstrategy=0x1117528) at vacuumlazy.c:431
#4 0x00000000006a71a7 in table_relation_vacuum (rel=0x7f8d99610d08,
params=0x7ffeeaddb7f0, bstrategy=0x1117528) at
../../../src/include/access/tableam.h:1432
#5 0x00000000006a9899 in vacuum_rel (relid=16384, relation=0x103b308,
params=0x7ffeeaddb7f0) at vacuum.c:1870
#6 0x00000000006a7c22 in vacuum (relations=0x11176b8,
params=0x7ffeeaddb7f0, bstrategy=0x1117528, isTopLevel=true) at vacuum.c:425
#7 0x00000000006a77e6 in ExecVacuum (pstate=0x105f578, vacstmt=0x103b3d8,
isTopLevel=true) at vacuum.c:228
#8 0x00000000008af401 in standard_ProcessUtility (pstmt=0x103b6f8,
queryString=0x103a808 "vacuum (parallel 2) tmp_t1;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x103b7d8, completionTag=0x7ffeeaddbc50 "") at utility.c:670
#9 0x00000000008aec40 in ProcessUtility (pstmt=0x103b6f8,
queryString=0x103a808 "vacuum (parallel 2) tmp_t1;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x103b7d8, completionTag=0x7ffeeaddbc50 "") at utility.c:360
#10 0x00000000008addbb in PortalRunUtility (portal=0x10a1a28,
pstmt=0x103b6f8, isTopLevel=true, setHoldSnapshot=false, dest=0x103b7d8,
completionTag=0x7ffeeaddbc50 "") at pquery.c:1175
#11 0x00000000008adf9f in PortalRunMulti (portal=0x10a1a28,
isTopLevel=true, setHoldSnapshot=false, dest=0x103b7d8, altdest=0x103b7d8,
completionTag=0x7ffeeaddbc50 "") at pquery.c:1321
#12 0x00000000008ad55d in PortalRun (portal=0x10a1a28,
count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x103b7d8,
altdest=0x103b7d8, completionTag=0x7ffeeaddbc50 "")
at pquery.c:796
#13 0x00000000008a7789 in exec_simple_query (query_string=0x103a808
"vacuum (parallel 2) tmp_t1;") at postgres.c:1231
#14 0x00000000008ab8f2 in PostgresMain (argc=1, argv=0x1065b00,
dbname=0x1065a28 "postgres", username=0x1065a08 "mahendra") at
postgres.c:4256
#15 0x0000000000811a42 in BackendRun (port=0x105d9c0) at postmaster.c:4465
#16 0x0000000000811241 in BackendStartup (port=0x105d9c0) at
postmaster.c:4156
#17 0x000000000080d7d6 in ServerLoop () at postmaster.c:1718
#18 0x000000000080d096 in PostmasterMain (argc=3, argv=0x1035270) at
postmaster.c:1391
#19 0x000000000072accb in main (argc=3, argv=0x1035270) at main.c:210

I did some analysis and found that we are trying to free some already-freed
memory, or we are freeing palloc'd memory in vac_update_relstats.
for (i = 0; i < nindexes; i++)
{
if (stats[i] == NULL || stats[i]->estimated_count)
continue;

/* Update index statistics */
vac_update_relstats(Irel[i],
stats[i]->num_pages,
stats[i]->num_index_tuples,
0,
false,
InvalidTransactionId,
InvalidMultiXactId,
false);
pfree(stats[i]);
}

As my table has 2 indexes, we have to free both stats. When i = 0, it
frees properly, but when i = 1, vac_update_relstats appears to be freeing
the memory.

(gdb) p *stats[i]
$1 = {num_pages = 218, pages_removed = 0, estimated_count = false,
num_index_tuples = 30000, tuples_removed = 30000, pages_deleted = 102,
pages_free = 0}
(gdb) p *stats[i]
$2 = {num_pages = 0, pages_removed = 65536, estimated_count = false,
num_index_tuples = 0, tuples_removed = 0, pages_deleted = 0, pages_free = 0}
(gdb)

From the above data, it looks like somewhere inside vac_update_relstats we
are freeing all the palloc'd memory. I don't know why that is.
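
For reference, pfree() assumes the pointer it is given points at the start
of a palloc'd chunk, because the owning memory context is read from a
header stored immediately before the chunk. A simplified sketch of the
crash site (not the exact mcxt.c source, which has more checks):

void
pfree(void *pointer)
{
	/*
	 * Reads the chunk header just before 'pointer'; a pointer into the
	 * middle of a larger palloc'd block yields a garbage context.
	 */
	MemoryContext context = GetMemoryChunkContext(pointer);

	context->methods->free_p(context, pointer);	/* mcxt.c:1060 */
}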

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#106Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#105)
1 attachment(s)

On Fri, Oct 11, 2019 at 4:47 PM Mahendra Singh <mahi6run@gmail.com> wrote:

I did some analysis and found that we are trying to free some already-freed memory, or we are freeing palloc'd memory in vac_update_relstats.
for (i = 0; i < nindexes; i++)
{
if (stats[i] == NULL || stats[i]->estimated_count)
continue;

/* Update index statistics */
vac_update_relstats(Irel[i],
stats[i]->num_pages,
stats[i]->num_index_tuples,
0,
false,
InvalidTransactionId,
InvalidMultiXactId,
false);
pfree(stats[i]);
}

As my table has 2 indexes, we have to free both stats. When i = 0, it frees properly, but when i = 1, vac_update_relstats appears to be freeing the memory.

(gdb) p *stats[i]
$1 = {num_pages = 218, pages_removed = 0, estimated_count = false, num_index_tuples = 30000, tuples_removed = 30000, pages_deleted = 102, pages_free = 0}
(gdb) p *stats[i]
$2 = {num_pages = 0, pages_removed = 65536, estimated_count = false, num_index_tuples = 0, tuples_removed = 0, pages_deleted = 0, pages_free = 0}
(gdb)

From the above data, it looks like somewhere inside vac_update_relstats we are freeing all the palloc'd memory. I don't know why that is.

I don't think the problem is in vac_update_relstats as we are not even
passing stats to it, so it won't be able to free it. I think the real
problem is in the way we copy the stats from shared memory to local
memory in the function end_parallel_vacuum(). Basically, it allocates
the memory for all the index stats together, and then in the function
update_index_statistics it is trying to free the memory of individual
array elements; that won't work. I have tried to fix the allocation
in end_parallel_vacuum; see if this fixes the problem for you. You
need to apply the attached patch atop
v28-0001-Add-parallel-option-to-VACUUM-command posted above by
Sawada-San.
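
To illustrate, here is a minimal sketch of the broken pattern and of the
fix (simplified and with made-up function names, not the actual patch
code):

static void
broken_alloc(IndexBulkDeleteResult **stats, int nindexes)
{
	IndexBulkDeleteResult *all_stats;
	int			i;

	/* One palloc covering every index's stats ... */
	all_stats = palloc(sizeof(IndexBulkDeleteResult) * nindexes);
	for (i = 0; i < nindexes; i++)
		stats[i] = &all_stats[i];	/* interior pointers into one chunk */

	/*
	 * ... but update_index_statistics() later does pfree(stats[i]) per
	 * element. stats[0] == all_stats, so the first pfree happens to work;
	 * stats[1] is not the start of a palloc'd chunk, so it crashes.
	 */
}

static void
fixed_alloc(IndexBulkDeleteResult **stats, int nindexes)
{
	int			i;

	/* Allocate each element separately so every stats[i] is pfree-able */
	for (i = 0; i < nindexes; i++)
		stats[i] = palloc0(sizeof(IndexBulkDeleteResult));
}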

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

Fix-memory-allocation-for-copying-the-stats.patch (application/octet-stream)
From e10774420cd8f7ab56fdebd6eb49f6e37de46957 Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Sat, 12 Oct 2019 08:49:34 +0530
Subject: [PATCH] Fix memory allocation for copying the stats.

---
 src/backend/access/heap/vacuumlazy.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3c5e16608e..ea421e55da 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2941,29 +2941,26 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
  *
  * All writes are not allowed during parallel mode and it might not be
  * safe to exit from the parallel mode while keeping the parallel context.
- * So we copy the updated index statistics to a temporary space and adjust
- * 'stats' so that we can update index statistics after exited from the
- * parallel mode.
+ * So we copy the updated index statistics to local memory and then later
+ * use that to update the index statistics.
  */
 static void
 end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
 					IndexBulkDeleteResult **stats)
 {
-	IndexBulkDeleteResult *copied_stats;
 	int i;
 
 	Assert(!IsParallelWorker());
 
-	/* Copy the updated statistics and adjust each elements of stats */
-	copied_stats = palloc(sizeof(IndexBulkDeleteResult) * nindexes);
+	/* copy the updated statistics */
 	for (i = 0; i < nindexes; i++)
 	{
 		if (lps->lvshared->indstats[i].updated)
 		{
-			memcpy(&(copied_stats[i]),
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
 				   &(lps->lvshared->indstats[i].stats),
 				   sizeof(IndexBulkDeleteResult));
-			stats[i] = &(copied_stats[i]);
 		}
 		else
 			stats[i] = NULL;
-- 
2.16.2.windows.1

#107Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#106)
2 attachment(s)

On Sat, Oct 12, 2019 at 12:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 11, 2019 at 4:47 PM Mahendra Singh <mahi6run@gmail.com> wrote:

I did some analysis and found that we are trying to free some already-freed memory, or we are freeing palloc'd memory in vac_update_relstats.
for (i = 0; i < nindexes; i++)
{
if (stats[i] == NULL || stats[i]->estimated_count)
continue;

/* Update index statistics */
vac_update_relstats(Irel[i],
stats[i]->num_pages,
stats[i]->num_index_tuples,
0,
false,
InvalidTransactionId,
InvalidMultiXactId,
false);
pfree(stats[i]);
}

As my table has 2 indexes, we have to free both stats. When i = 0, it frees properly, but when i = 1, vac_update_relstats appears to be freeing the memory.

(gdb) p *stats[i]
$1 = {num_pages = 218, pages_removed = 0, estimated_count = false, num_index_tuples = 30000, tuples_removed = 30000, pages_deleted = 102, pages_free = 0}
(gdb) p *stats[i]
$2 = {num_pages = 0, pages_removed = 65536, estimated_count = false, num_index_tuples = 0, tuples_removed = 0, pages_deleted = 0, pages_free = 0}
(gdb)

From the above data, it looks like somewhere inside vac_update_relstats we are freeing all the palloc'd memory. I don't know why that is.

I don't think the problem is in vac_update_relstats as we are not even
passing stats to it, so it won't be able to free it. I think the real
problem is in the way we copy the stats from shared memory to local
memory in the function end_parallel_vacuum(). Basically, it allocates
the memory for all the index stats together, and then in the function
update_index_statistics it is trying to free the memory of individual
array elements; that won't work. I have tried to fix the allocation
in end_parallel_vacuum; see if this fixes the problem for you. You
need to apply the attached patch atop
v28-0001-Add-parallel-option-to-VACUUM-command posted above by
Sawada-San.

Thank you for reviewing and creating the patch!

I think the patch fixes this issue correctly. Attached is the updated
version patch.

Regards,

--
Masahiko Sawada

Attachments:

v29-0002-Add-paralell-P-option-to-vacuumdb-command.patch (application/x-patch)
From 33c58347fa7a0aa13a9d4494e7c07937f3180da2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v29 2/2] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

v29-0001-Add-parallel-option-to-VACUUM-command.patch (application/x-patch)
From 5d10be72455501d8db775140be758774ea48e199 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 2 Oct 2019 22:46:21 +0900
Subject: [PATCH v29 1/2] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than the
number of indexes that the table has.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  44 ++
 src/backend/access/heap/vacuumlazy.c  | 887 +++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  45 ++
 src/backend/postmaster/autovacuum.c   |   2 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  14 +
 src/test/regress/sql/vacuum.sql       |  10 +
 11 files changed, 921 insertions(+), 109 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 47b12c6a8f..9012e5549e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2265,13 +2265,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..801daddb1f 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,31 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes on the table. Workers for
+      vacuum are launched before the start of each phase and exit at the end
+      of the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +263,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +354,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..ea421e55da 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed the parallel worker processes exit.  Then the
+ * leader process re-initializes the parallel context while keeping the
+ * recorded dead tuples so that it can launch parallel workers again the
+ * next time.  Note that parallel workers live only during either index
+ * vacuuming or index cleanup, but the leader process neither exits from
+ * parallel mode nor destroys the parallel context.  Since no updates are
+ * allowed during parallel mode, we update the index statistics after
+ * exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -41,8 +56,10 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +72,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +128,111 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are in
+ * parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum mode;
+ * otherwise it is allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel vacuum workers.  This is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to do index vacuuming or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples. We set either an
+	 * old live tuples in index vacuuming case or the new live tuples in index
+	 * cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVSharedIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/* User-requested parallel degree */
+	int				nworkers_requested;
+
+	/*
+	 * Always true except in a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +251,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +273,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +286,37 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, int nindexes,
+											  int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										 int nindexes, IndexBulkDeleteResult **stats,
+										 LVParallelState *lps);
+static void lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										  int nindexes, IndexBulkDeleteResult **stats,
+										  LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static int compute_parallel_workers(int nrequested, int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
 
 
 /*
@@ -488,6 +630,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before
+ *		starting the heap scan so that we can record dead tuples in the DSM
+ *		segment. All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once all indexes are
+ *		processed. At the end of this function we exit from parallel mode.
+ *		Index bulk-deletion results are stored in the DSM segment, and we
+ *		update the index statistics all at once after exiting parallel mode,
+ *		since writes are not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +650,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +674,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +710,43 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(params->nworkers, nindexes);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, nindexes,
+									parallel_workers);
+
+		/* Remember the user-requested parallel degree for reporting */
+		lps->nworkers_requested = params->nworkers;
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +924,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +953,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +973,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1169,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1208,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1354,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1424,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1453,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1568,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1602,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1618,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1644,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1722,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1731,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1779,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1790,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1920,258 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuum indexes with parallel vacuum workers. This function must be used
+ * only by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							 int nindexes, IndexBulkDeleteResult **stats,
+							 LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index vacuuming */
+	lps->lvshared->for_cleanup = false;
+
+	/*
+	 * We can only provide an approximate value of num_heap_tuples in
+	 * vacuum cases.
+	 */
+	lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+	lps->lvshared->estimated_count = true;
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+							 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index vacuuming with parallel workers. The leader process does
+	 * it alone when no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the parallel context to relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lps->pcxt);
+}
+
+/*
+ * Clean up indexes with parallel vacuum workers. This function must be used
+ * only by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							  int nindexes, IndexBulkDeleteResult **stats,
+							  LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index cleanup */
+	lps->lvshared->for_cleanup = true;
+
+	/*
+	 * Now we can provide a better estimate of total number of surviving
+	 * tuples (we assume indexes are more interested in that than in the
+	 * number of nominally live tuples).
+	 */
+	lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+	lps->lvshared->estimated_count =
+		(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+							 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index cleanup with parallel workers. The leader process does it
+	 * alone when no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * Unlike parallel index vacuuming, we don't need to reinitialize the
+	 * parallel context here, as no more index vacuuming or index cleanup
+	 * will be performed after this.
+	 */
+}
+
+/*
+ * Index vacuum and index cleanup routine used by parallel vacuum worker processes
+ * including the leader process.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete or
+		 * amvacuumcleanup to the DSM segment the first time we get it,
+		 * because the access method allocates it locally and a different
+		 * vacuum process may process this index next time. This copying
+		 * normally happens only once, after the first index vacuuming; from
+		 * then on we pass the result stored in the DSM segment so that the
+		 * access method updates it directly.
+		 *
+		 * Since all vacuum workers write their bulk-deletion results into
+		 * different slots, we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			 * We no longer need the locally allocated result; stats[idx] now
+			 * points into the DSM segment.
+			 * points to the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Clean up indexes. In the parallel vacuum case, this function must be used
+ * only by the parallel vacuum leader process.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	if (ParallelVacuumIsActive(lps))
+	{
+		/*
+		 * Generally index cleanup does not scan the index when index vacuuming
+		 * (ambulkdelete) was performed. So we perform index cleanup with parallel
+		 * workers only if we have not performed index vacuuming yet. Otherwise
+		 * we do it the leader process alone.
+		 */
+		if (vacrelstats->num_index_scans == 0)
+			lazy_parallel_cleanup_indexes(vacrelstats, Irel, nindexes, stats, lps);
+		else
+		{
+			lps->lvshared->for_cleanup = true;
+			vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
+											 lps->lvshared,
+											 vacrelstats->dead_tuples);
+		}
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes. In the parallel vacuum case, this function must be used
+ * only by the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * Perform index vacuuming. If parallel vacuum is active we perform
+	 * index vacuuming with parallel workers. Otherwise do it alone.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		lazy_parallel_vacuum_indexes(vacrelstats, Irel, nindexes, stats, lps);
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2181,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2220,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class, but only if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
 
-	pfree(stats);
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2583,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2607,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2663,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2816,226 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The relation sizes of the table and indexes don't affect the parallel
+ * degree for now. nrequested is the number of parallel workers that the
+ * user requested. If nrequested is 0, we compute the parallel degree
+ * based on nindexes, the number of indexes the table has.
+ */
+static int
+compute_parallel_workers(int nrequested, int nindexes)
+{
+	int		parallel_workers;
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = nindexes;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	lps->nworkers_requested = 0;
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed during parallel mode, and it might not be safe
+ * to exit parallel mode while keeping the parallel context. So we copy
+ * the updated index statistics to local memory and then later use that
+ * to update the index statistics.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (lps->lvshared->indstats[i].updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(lps->lvshared->indstats[i].stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers are used only for index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table. The lock mode is the same as the leader process's. It's
+	 * okay because the lock mode does not conflict among the parallel workers.
+	 */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted by OID, which should match
+	 * the order in the leader process.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e154507ecd..23f20d93e5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -1736,6 +1765,22 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Check that it's a temporary relation and PARALLEL option is specified.
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relation.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("skipping \"%s\" --- cannot parallel vacuum temporary tables",
+						RelationGetRelationName(onerel))));
+		relation_close(onerel, lmode);
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+		return false;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313337..de43d1e4f0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers
+	 * and 0 for choosing based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..f134412c3d 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,20 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+WARNING:  skipping "tmp" --- cannot parallel vacuum temporary tables
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..66a9b110fe 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

#108Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#106)

Thanks, Amit, for the patch.

The crash is fixed by this patch.

Thanks and Regards
Mahendra Thalor

On Sat, Oct 12, 2019, 09:03 Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 11, 2019 at 4:47 PM Mahendra Singh <mahi6run@gmail.com> wrote:

I did some analysis and found that we are trying to free some already
freed memory. Or we are freeing palloced memory in vac_update_relstats.

for (i = 0; i < nindexes; i++)
{
	if (stats[i] == NULL || stats[i]->estimated_count)
		continue;

	/* Update index statistics */
	vac_update_relstats(Irel[i],
						stats[i]->num_pages,
						stats[i]->num_index_tuples,
						0,
						false,
						InvalidTransactionId,
						InvalidMultiXactId,
						false);
	pfree(stats[i]);
}

As my table has 2 indexes, we have to free both stats. When i = 0,
it is freeing properly, but when i = 1, vac_update_relstats is freeing
memory.

(gdb) p *stats[i]
$1 = {num_pages = 218, pages_removed = 0, estimated_count = false,
num_index_tuples = 30000, tuples_removed = 30000, pages_deleted = 102,
pages_free = 0}

(gdb) p *stats[i]
$2 = {num_pages = 0, pages_removed = 65536, estimated_count = false,
num_index_tuples = 0, tuples_removed = 0, pages_deleted = 0, pages_free = 0}

(gdb)

From the above data, it looks like somewhere inside vac_update_relstats we
are freeing all palloced memory. I don't know why that is.

I don't think the problem is in vac_update_relstats as we are not even
passing stats to it, so it won't be able to free it. I think the real
problem is in the way we copy the stats from shared memory to local
memory in the function end_parallel_vacuum(). Basically, it allocates
the memory for all the index stats together and then in function
update_index_statistics, it is trying to free the memory of individual
array elements, which won't work. I have tried to fix the allocation
in end_parallel_vacuum(); see if this fixes the problem for you. You
need to apply the attached patch atop
v28-0001-Add-parallel-option-to-VACUUM-command posted above by
Sawada-San.
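
To make the distinction concrete, here is a minimal sketch of the broken
and the fixed allocation patterns (illustrative shapes only, not the
actual code of either patch version). pfree() may only be given a pointer
that palloc returned, so freeing interior pointers into one combined
chunk is invalid, whereas one palloc per index keeps each pfree(stats[i])
in update_index_statistics() valid:

/* Broken: one combined chunk, so stats[i] are interior pointers */
static void
copy_index_stats_broken(LVShared *lvshared, IndexBulkDeleteResult **stats,
						int nindexes)
{
	IndexBulkDeleteResult *buf = (IndexBulkDeleteResult *)
		palloc0(nindexes * sizeof(IndexBulkDeleteResult));
	int			i;

	for (i = 0; i < nindexes; i++)
	{
		memcpy(&buf[i], &lvshared->indstats[i].stats,
			   sizeof(IndexBulkDeleteResult));
		stats[i] = &buf[i];		/* pfree(stats[i]) is valid only for i == 0 */
	}
}

/* Fixed: one palloc per index, so each stats[i] can be freed on its own */
static void
copy_index_stats_fixed(LVShared *lvshared, IndexBulkDeleteResult **stats,
					   int nindexes)
{
	int			i;

	for (i = 0; i < nindexes; i++)
	{
		stats[i] = (IndexBulkDeleteResult *)
			palloc0(sizeof(IndexBulkDeleteResult));
		memcpy(stats[i], &lvshared->indstats[i].stats,
			   sizeof(IndexBulkDeleteResult));
	}
}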

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#109Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#107)
1 attachment(s)

On Sat, Oct 12, 2019 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Oct 12, 2019 at 12:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 11, 2019 at 4:47 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Thank you for reviewing and creating the patch!

I think the patch fixes this issue correctly. Attached the updated
version patch.

I see a much bigger problem with the way this patch collects the index
stats in shared memory. IIUC, it allocates the shared memory (DSM)
for all the index stats, in the same way, considering its size as
IndexBulkDeleteResult. For the first time, it gets the stats from
local memory as returned by ambulkdelete/amvacuumcleanup call and then
copies it in shared memory space. There onwards, it always updates
the stats in shared memory by pointing each index stats to that
memory. In this scheme, you overlooked the point that an index AM
could choose to return a larger structure of which
IndexBulkDeleteResult is just the first field. This generally
provides a way for ambulkdelete to communicate additional private data
to amvacuumcleanup. We use this idea in the gist index, see how
gistbulkdelete and gistvacuumcleanup works. The current design won't
work for such cases.
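
For reference, the struct-extension idiom being described looks roughly
like this (a sketch modeled on gist; besides the embedded first member
and page_set_context mentioned elsewhere in this thread, the layout is
illustrative):

typedef struct GistBulkDeleteResult
{
	IndexBulkDeleteResult std;	/* common part, must come first */

	/*
	 * Private state handed from gistbulkdelete to gistvacuumcleanup,
	 * e.g. a memory context and the sets of pages collected during the
	 * index scan.
	 */
	MemoryContext page_set_context;
} GistBulkDeleteResult;

Copying only sizeof(IndexBulkDeleteResult) bytes into the DSM segment, as
the current patch does, would silently truncate this private part.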

One idea is to change the design such that each index method provides
a method to estimate/allocate the shared memory required for the stats
of ambulkdelete/amvacuumcleanup, and later we also use an index
method-specific function which copies the stats from local memory to
shared memory. I think this needs further investigation.
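
Purely as a sketch of that idea (none of these callbacks exist today; the
names and signatures are hypothetical), the index AM interface could gain
something like:

typedef Size (*amestimatebulkdeleteresult_function) (Relation indexRelation);
typedef void (*amcopybulkdeleteresult_function) (IndexBulkDeleteResult *dest,
												 IndexBulkDeleteResult *src);

The leader would call the estimate function while sizing the DSM segment,
and the copy function instead of a fixed-size memcpy.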

I have also made a few other changes in the attached delta patch. The
main point fixed by the attached patch is that even if we don't allow
a parallel vacuum on temporary tables, the analyze should still be able
to work if the user has asked for it. I have also changed an error
message and made a few other cosmetic changes to comments. Kindly
include these in the next version if you don't find any problem with
the changes.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_comments_amit_1.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3c5e16608e..77bb4a265c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -197,7 +197,7 @@ typedef struct LVShared
 	/*
 	 * Fields for both index vacuuming and index cleanup.
 	 *
-	 * reltuples is the total number of input heap tuples. We set either an
+	 * reltuples is the total number of input heap tuples.  We set either an
 	 * old live tuples in index vacuuming case or the new live tuples in index
 	 * cleanup case.
 	 *
@@ -207,7 +207,7 @@ typedef struct LVShared
 	bool	estimated_count;
 
 	/*
-	 * Variables to control parallel index vacuuming. A variable-sized field
+	 * Variables to control parallel index vacuuming.  A variable-sized field
 	 * 'indstats' must come last.
 	 */
 	pg_atomic_uint32	nprocessed;
@@ -2108,9 +2108,9 @@ lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	{
 		/*
 		 * Generally index cleanup does not scan the index when index vacuuming
-		 * (ambulkdelete) was performed. So we perform index cleanup with parallel
-		 * workers only if we have not performed index vacuuming yet. Otherwise
-		 * we do it the leader process alone.
+		 * (ambulkdelete) was already performed.  So we perform index cleanup
+		 * with parallel workers only if we have not performed index vacuuming
+		 * yet.  Otherwise, we do it in the leader process alone.
 		 */
 		if (vacrelstats->num_index_scans == 0)
 			lazy_parallel_cleanup_indexes(vacrelstats, Irel, nindexes, stats, lps);
@@ -2146,7 +2146,8 @@ lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 
 	/*
 	 * Perform index vacuuming. If parallel vacuum is active we perform
-	 * index vacuuming with parallel workers. Otherwise do it alone.
+	 * index vacuuming with parallel workers.  Otherwise, we do it in the
+	 * leader process alone.
 	 */
 	if (ParallelVacuumIsActive(lps))
 		lazy_parallel_vacuum_indexes(vacrelstats, Irel, nindexes, stats, lps);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 23f20d93e5..ff8c7760c0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1766,19 +1766,19 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	}
 
 	/*
-	 * Check that it's a temporary relation and PARALLEL option is specified.
 	 * Since parallel workers cannot access data in temporary tables, parallel
 	 * vacuum is not allowed for temporary relation.
 	 */
 	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
 	{
 		ereport(WARNING,
-				(errmsg("skipping \"%s\" --- cannot parallel vacuum temporary tables",
+				(errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
 						RelationGetRelationName(onerel))));
 		relation_close(onerel, lmode);
 		PopActiveSnapshot();
 		CommitTransactionCommand();
-		return false;
+		/* It's OK to proceed with ANALYZE on this table */
+		return true;
 	}
 
 	/*
#110Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#109)

On Sat, Oct 12, 2019 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Oct 12, 2019 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I see a much bigger problem with the way this patch collects the index
stats in shared memory. IIUC, it allocates the shared memory (DSM)
for all the index stats, in the same way, considering its size as
IndexBulkDeleteResult. For the first time, it gets the stats from
local memory as returned by ambulkdelete/amvacuumcleanup call and then
copies it in shared memory space. There onwards, it always updates
the stats in shared memory by pointing each index stats to that
memory. In this scheme, you overlooked the point that an index AM
could choose to return a larger structure of which
IndexBulkDeleteResult is just the first field. This generally
provides a way for ambulkdelete to communicate additional private data
to amvacuumcleanup. We use this idea in the gist index, see how
gistbulkdelete and gistvacuumcleanup works. The current design won't
work for such cases.

Today, I looked at gistbulkdelete and gistvacuumcleanup closely and I
have a few observations about those which might help us to solve this
problem for gist indexes:
1. Are we using memory context GistBulkDeleteResult->page_set_context?
It seems to me it is not being used.
2. Each time we perform gistbulkdelete, we always seem to reset the
GistBulkDeleteResult stats, see gistvacuumscan. So, how will it
accumulate it for the cleanup phase when the vacuum needs to call
gistbulkdelete multiple times because the available space for
dead-tuple is filled. It seems to me like we only use the stats from
the very last call to gistbulkdelete.
3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup. Can't we
do it in gistbulkdelte? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

This is not directly related to this patch, so we can discuss these
observations in a separate thread as well, but before that, I wanted
to check your opinion to see if this makes sense to you as this will
help us in moving this patch forward.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#111Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#110)

On Mon, Oct 14, 2019 at 3:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Oct 12, 2019 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Oct 12, 2019 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I see a much bigger problem with the way this patch collects the index
stats in shared memory. IIUC, it allocates the shared memory (DSM)
for all the index stats, in the same way, considering its size as
IndexBulkDeleteResult. For the first time, it gets the stats from
local memory as returned by ambulkdelete/amvacuumcleanup call and then
copies it in shared memory space. There onwards, it always updates
the stats in shared memory by pointing each index stats to that
memory. In this scheme, you overlooked the point that an index AM
could choose to return a larger structure of which
IndexBulkDeleteResult is just the first field. This generally
provides a way for ambulkdelete to communicate additional private data
to amvacuumcleanup. We use this idea in the gist index, see how
gistbulkdelete and gistvacuumcleanup work. The current design won't
work for such cases.

Today, I looked at gistbulkdelete and gistvacuumcleanup closely and I
have a few observations about those which might help us to solve this
problem for gist indexes:
1. Are we using memory context GistBulkDeleteResult->page_set_context?
It seems to me it is not being used.

It appears to me as well that it's not being used.

2. Each time we perform gistbulkdelete, we always seem to reset the
GistBulkDeleteResult stats, see gistvacuumscan. So, how will it
accumulate it for the cleanup phase when the vacuum needs to call
gistbulkdelete multiple times because the available space for
dead tuples is filled? It seems to me like we only use the stats from
the very last call to gistbulkdelete.

IIUC, it is fine to use the stats from the latest gistbulkdelete call
because we are trying to collect the information about the empty pages
while scanning the tree. So I think it would be fine to just use the
information collected from the latest scan; otherwise we will get
duplicate information.

3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup? Can't we
do it in gistbulkdelete? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.

It seems that we want to use the latest result. That might be the
reason for postponing it to the cleanup phase.

At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

This is not directly related to this patch, so we can discuss these
observations in a separate thread as well, but before that, I wanted
to check your opinion to see if this makes sense to you as this will
help us in moving this patch forward.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#112Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#110)

On Mon, Oct 14, 2019 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Oct 12, 2019 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Oct 12, 2019 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I see a much bigger problem with the way this patch collects the index
stats in shared memory. IIUC, it allocates the shared memory (DSM)
for all the index stats, in the same way, considering its size as
IndexBulkDeleteResult. For the first time, it gets the stats from
local memory as returned by ambulkdelete/amvacuumcleanup call and then
copies it in shared memory space. There onwards, it always updates
the stats in shared memory by pointing each index stats to that
memory. In this scheme, you overlooked the point that an index AM
could choose to return a larger structure of which
IndexBulkDeleteResult is just the first field. This generally
provides a way for ambulkdelete to communicate additional private data
to amvacuumcleanup. We use this idea in the gist index, see how
gistbulkdelete and gistvacuumcleanup work. The current design won't
work for such cases.

Indeed. That's a very good point. Thank you for pointing it out.

Today, I looked at gistbulkdelete and gistvacuumcleanup closely and I
have a few observations about those which might help us to solve this
problem for gist indexes:
1. Are we using memory context GistBulkDeleteResult->page_set_context?
It seems to me it is not being used.

Yes, I also think this memory context is not being used.

2. Each time we perform gistbulkdelete, we always seem to reset the
GistBulkDeleteResult stats, see gistvacuumscan. So, how will it
accumulate it for the cleanup phase when the vacuum needs to call
gistbulkdelete multiple times because the available space for
dead tuples is filled? It seems to me like we only use the stats from
the very last call to gistbulkdelete.

I think you're right. gistbulkdelete scans all pages and collects all
internal pages and all empty pages. And then in gistvacuumcleanup it
uses them to unlink all empty pages. Currently it accumulates such
information over multiple gistbulkdelete calls because it doesn't
switch the memory context, but I guess this code intends to use the
information only from the very last call to gistbulkdelete.

3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup? Can't we
do it in gistbulkdelete? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

Considering the gist index's current strategy of page deletion, the
advantage of postponing the page deletion till the cleanup phase is
that we can do the bulk deletion in the cleanup phase, which is called
at most once. But I wonder if we can do the page deletion in a similar
way to the btree index. Or, even if we keep the current strategy, I
think we can do that without passing the page information from
bulkdelete to vacuumcleanup via GistBulkDeleteResult.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But given your point, I guess there might be other index AMs that
use the stats returned from bulkdelete in a similar way to the gist
index (i.e. using a larger structure of which IndexBulkDeleteResult is
just the first field). If we have the same concern elsewhere, parallel
vacuum still needs to deal with that, as you mentioned.

Regards,

--
Masahiko Sawada

#113Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#112)

On Tue, Oct 15, 2019 at 10:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 14, 2019 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup? Can't we
do it in gistbulkdelete? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

Considering the current strategy of page deletion of gist index the
advantage of postponing the page deletion till the cleanup phase is
that we can do the bulk deletion in cleanup phase which is called at
most once. But I wonder if we can do the page deletion in the similar
way to btree index.

I think there might be some advantage of the current strategy due to
which it has been chosen. I was going through the development thread
and noticed an old email which points to something related to this.
See [1]/messages/by-id/8548498B-6EC6-4C89-8313-107BEC437489@yandex-team.ru.

Or even we use the current strategy I think we can
do that while not passing the pages information from bulkdelete to
vacuumcleanup using by GistBulkDeleteResult.

Yeah, I also think so. I have started a new thread [2]/messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com to know the
opinion of others on this matter.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your pointing out I guess that there might be
other index AMs use the stats returned from bulkdelete in the similar
way to gist index (i.e. using more larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

[1]: /messages/by-id/8548498B-6EC6-4C89-8313-107BEC437489@yandex-team.ru
[2]: /messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#114Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#113)

On Tue, Oct 15, 2019 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 10:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 14, 2019 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup? Can't we
do it in gistbulkdelete? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

Considering the current strategy of page deletion of gist index the
advantage of postponing the page deletion till the cleanup phase is
that we can do the bulk deletion in cleanup phase which is called at
most once. But I wonder if we can do the page deletion in the similar
way to btree index.

I think there might be some advantage of the current strategy due to
which it has been chosen. I was going through the development thread
and noticed some old email which points something related to this.
See [1].

Thanks.

Or even we use the current strategy I think we can
do that while not passing the pages information from bulkdelete to
vacuumcleanup using by GistBulkDeleteResult.

Yeah, I also think so. I have started a new thread [2] to know the
opinion of others on this matter.

Thank you.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your pointing out I guess that there might be
other index AMs use the stats returned from bulkdelete in the similar
way to gist index (i.e. using more larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

Agreed. I'll create a separate patch to add this callback and change the
parallel vacuum patch so that it checks whether each index participates,
and then vacuums the non-participating indexes after the parallel vacuum.

Regards,

--
Masahiko Sawada

#115Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#114)

On Tue, Oct 15, 2019 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 10:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 14, 2019 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup? Can't we
do it in gistbulkdelete? I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

Considering the current strategy of page deletion of gist index the
advantage of postponing the page deletion till the cleanup phase is
that we can do the bulk deletion in cleanup phase which is called at
most once. But I wonder if we can do the page deletion in the similar
way to btree index.

I think there might be some advantage of the current strategy due to
which it has been chosen. I was going through the development thread
and noticed some old email which points something related to this.
See [1].

Thanks.

Or even we use the current strategy I think we can
do that while not passing the pages information from bulkdelete to
vacuumcleanup using by GistBulkDeleteResult.

Yeah, I also think so. I have started a new thread [2] to know the
opinion of others on this matter.

Thank you.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your pointing out I guess that there might be
other index AMs use the stats returned from bulkdelete in the similar
way to gist index (i.e. using more larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

Agreed. I'll create a separate patch to add this callback and change
parallel vacuum patch so that it checks the participation of indexes
and then vacuums on un-participated indexes after parallel vacuum.

amcanparallelvacuum doesn't need to be a callback; it can be a
boolean field of IndexAmRoutine.

Regards,

--
Masahiko Sawada

#116Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#113)

On Tue, Oct 15, 2019 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

For estimating the size of the stats I suggest "amestimatestat" or
"amstatsize", and for copying the stats data we can add "amcopystat".
That would be helpful for extending parallel vacuum to indexes that
have extended stats.
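
A rough sketch of what such callbacks could look like; the names and
signatures below are only illustrative, nothing like this exists yet:

#include <stddef.h>

struct IndexBulkDeleteResult;	/* the common stats struct */

/* report how many bytes this AM needs for its (possibly extended) stats */
typedef size_t (*amestimatestat_function) (void);

/* copy the AM's stats, including any private fields, into shared memory */
typedef void (*amcopystat_function) (struct IndexBulkDeleteResult *dst,
									 const struct IndexBulkDeleteResult *src);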

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#117Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#115)

On Tue, Oct 15, 2019 at 1:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your pointing out I guess that there might be
other index AMs use the stats returned from bulkdelete in the similar
way to gist index (i.e. using more larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

Agreed. I'll create a separate patch to add this callback and change
parallel vacuum patch so that it checks the participation of indexes
and then vacuums on un-participated indexes after parallel vacuum.

amcanparallelvacuum is not necessary to be a callback, it can be a
boolean field of IndexAmRoutine.

Yes, it will be a boolean. Note that for parallel-index scans, we
already have amcanparallel.
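
As a minimal sketch, assuming the flag is added alongside the existing
amcanparallel and with all other fields trimmed for brevity, it could
look like this in IndexAmRoutine:

#include <stdbool.h>

typedef struct IndexAmRoutine
{
	bool		amcanparallel;			/* parallel index scan support */
	bool		amcanparallelvacuum;	/* proposed: parallel vacuum support */
} IndexAmRoutine;

/* e.g., gist would initially opt out of parallel vacuum: */
static const IndexAmRoutine gist_routine_sketch = {
	.amcanparallelvacuum = false,
	/* other fields omitted */
};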

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#118Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#117)
3 attachment(s)

On Tue, Oct 15, 2019 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 1:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your pointing out I guess that there might be
other index AMs use the stats returned from bulkdelete in the similar
way to gist index (i.e. using more larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

Agreed. I'll create a separate patch to add this callback and change
parallel vacuum patch so that it checks the participation of indexes
and then vacuums on un-participated indexes after parallel vacuum.

amcanparallelvacuum is not necessary to be a callback, it can be a
boolean field of IndexAmRoutine.

Yes, it will be a boolean. Note that for parallel-index scans, we
already have amcanparallel.

Attached is the updated patch set. The 0001 patch introduces the new
index AM field amcanparallelvacuum; all index AMs except gist set it to
true for now. The 0002 patch incorporates all the comments I got so far.

Regards,

--
Masahiko Sawada

Attachments:

v30-0002-Add-parallel-option-to-VACUUM-command.patch (text/x-patch)
From 698ba00a46f06a196bc805693b060e9c5b721cf2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 2 Oct 2019 22:46:21 +0900
Subject: [PATCH v30 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum process. Therefore
parallel vacuum can be used when the table has at least two indexes,
and the parallel degree cannot be larger than the number of indexes
on the table.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |  14 +-
 doc/src/sgml/ref/vacuum.sgml          |  45 ++
 src/backend/access/heap/vacuumlazy.c  | 984 +++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   4 +
 src/backend/commands/vacuum.c         |  45 ++
 src/backend/postmaster/autovacuum.c   |   2 +
 src/bin/psql/tab-complete.c           |   2 +-
 src/include/access/heapam.h           |   3 +
 src/include/commands/vacuum.h         |   5 +
 src/test/regress/expected/vacuum.out  |  14 +
 src/test/regress/sql/vacuum.sql       |  10 +
 11 files changed, 1019 insertions(+), 109 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 47b12c6a8f..9012e5549e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2265,13 +2265,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..ae086b976b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on
+      the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index. So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table. Workers for
+      vacuum are launched before the start of each phase and exit at the end
+      of the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuuming.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..9f51b53408 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed, the parallel worker processes exit.  The leader
+ * process then re-initializes the parallel context while keeping the recorded
+ * dead tuples so that it can launch parallel workers again for the next
+ * phase.  Note that the parallel workers live only during either index
+ * vacuuming or index cleanup, but the leader process neither exits parallel
+ * mode nor destroys the parallel context in between.  Since no updates are
+ * allowed during parallel mode, we update the index statistics after
+ * exiting parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,119 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum
+ * mode, otherwise allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy vacuum.
+ * This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	/*
+	 * True if this slot is in use.  If false, we don't use the other fields;
+	 * index statistics are stored in local memory instead, and the leader
+	 * processes them after parallel index vacuuming.  in_use can be false if
+	 * the index does not support parallel index vacuuming.
+	 */
+	bool					in_use;
+
+	IndexBulkDeleteResult	stats;
+	bool					updated;	/* are the stats updated? */
+} LVSharedIndStats;
+
+/*
+ * Shared information among parallel workers.  This is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples count in the index vacuuming case or to the new live
+	 * tuples count in the index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/* The number of indexes that do NOT support parallel index vacuuming */
+	int		nindexes_nonparallel;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The variable-sized
+	 * field 'indstats' must come last.
+	 */
+	pg_atomic_uint32	nprocessed;
+	LVSharedIndStats	indstats[FLEXIBLE_ARRAY_MEMBER];
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, indstats) + sizeof(LVSharedIndStats))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +260,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +282,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +295,37 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										 int nindexes, IndexBulkDeleteResult **stats,
+										 LVParallelState *lps);
+static void lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										  int nindexes, IndexBulkDeleteResult **stats,
+										  LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +639,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment. All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes. At the end of
+ *		this function we exit from parallel mode. Index bulk-deletion results
+ *		are stored in the DSM segment, and we update the index statistics as a
+ *		whole after exiting parallel mode, since no writes are allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +659,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +683,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +719,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +931,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +960,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +980,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1176,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1215,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1361,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1431,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1460,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1575,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1609,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1625,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1651,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1729,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1738,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1786,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1797,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1927,303 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Vacuuming indexes with parallel vacuum workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+							 int nindexes, IndexBulkDeleteResult **stats,
+							 LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index vacuuming */
+	lps->lvshared->for_cleanup = false;
+
+	/*
+	 * We can only provide an approximate value of num_heap_tuples in
+	 * vacuum cases.
+	 */
+	lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+	lps->lvshared->estimated_count = true;
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+							 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index vacuuming with the parallel workers. The leader process
+	 * alone does that in case no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Reset the processing count */
+	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+	/*
+	 * Reinitialize the parallel context to relaunch parallel workers
+	 * for the next execution.
+	 */
+	ReinitializeParallelDSM(lps->pcxt);
+}
+
+/*
+ * Cleanup indexes with parallel vacuum workers. This function must be used
+ * by the parallel vacuum leader process.
+ */
+static void
+lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+									   int nindexes, IndexBulkDeleteResult **stats,
+									   LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Tell parallel workers to do index cleanup */
+	lps->lvshared->for_cleanup = true;
+
+	/*
+	 * Now we can provide a better estimate of total number of surviving
+	 * tuples (we assume indexes are more interested in that than in the
+	 * number of nominally live tuples).
+	 */
+	lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+	lps->lvshared->estimated_count =
+		(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	ereport(elevel,
+			(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+							 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+							 lps->pcxt->nworkers_launched),
+					lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join index cleanup with the parallel workers. The leader process alone
+	 * does that in case no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+
+	/*
+	 * Unlike parallel index vacuuming, we don't need to reinitialize the
+	 * parallel context, as no more index vacuuming or index cleanup will be
+	 * performed after this.
+	 */
+}
+
+/*
+ * Index vacuum and index cleanup routine used by parallel vacuum worker processes
+ * including the leader process.  After finishing each index, this function copies
+ * the index statistics returned from ambulkdelete and amvacuumcleanup to the
+ * DSM segment.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Skip unused slot */
+		if (!lvshared->indstats[idx].in_use)
+			continue;
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (lvshared->indstats[idx].updated &&
+			stats[idx] == NULL)
+			stats[idx] = &(lvshared->indstats[idx].stats);
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete and
+		 * amvacuumcleanup to the DSM segment the first time we get it from
+		 * them, because they allocate it locally and the index might be
+		 * vacuumed by a different vacuum process next time. Copying the
+		 * result normally happens only after the first index vacuuming. From
+		 * the second time onwards, we pass the result in the DSM segment so
+		 * that they update it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result at different
+		 * slots we can write them without locking.
+		 */
+		if (!lvshared->indstats[idx].updated && stats[idx] != NULL)
+		{
+			memcpy(&(lvshared->indstats[idx].stats),
+				   stats[idx], sizeof(IndexBulkDeleteResult));
+			lvshared->indstats[idx].updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = &(lvshared->indstats[idx].stats);
+		}
+	}
+}
+
+/*
+ * Cleanup indexes. This function must be used by the parallel vacuum leader
+ * process in the parallel vacuum case.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/*
+		 * Generally index cleanup does not scan the index when index vacuuming
+		 * (ambulkdelete) was already performed.  So we perform index cleanup
+		 * with parallel workers only if we have not performed index vacuuming
+		 * yet.  Otherwise, we do it in the leader process alone.
+		 */
+		if (vacrelstats->num_index_scans == 0)
+			lazy_parallel_cleanup_indexes(vacrelstats, Irel, nindexes, stats, lps);
+		else
+		{
+			/*
+			 * Do cleanup by the leader process alone.  Since we need to copy
+			 * the index statistics to the DSM segment we cannot use
+			 * lazy_cleanup_index directly.
+			 */
+			lps->lvshared->for_cleanup = true;
+			vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
+											 lps->lvshared,
+											 vacrelstats->dead_tuples);
+		}
+
+		/*
+		 * Done if there are no indexes that do not support parallel index vacuuming.
+		 * Otherwise fall through to do single process vacuum on such indexes.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already cleaned up during parallel index
+		 * vacuuming.
+		 */
+		if (ParallelVacuumIsActive(lps) && lps->lvshared->indstats[idx].in_use)
+			continue;
+
+		lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes. This function must be used by the parallel vacuum leader
+ * process in the parallel vacuum case.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index vacuuming with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		lazy_parallel_vacuum_indexes(vacrelstats, Irel, nindexes, stats, lps);
+
+		/*
+		 * Done if there are no indexes that do not support parallel index vacuuming.
+		 * Otherwise fall through to do single process vacuum on such indexes.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already vacuumed during parallel index
+		 * vacuuming.
+		 */
+		if (ParallelVacuumIsActive(lps) && lps->lvshared->indstats[idx].in_use)
+			continue;
+
+		lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+						  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2233,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2272,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
 
-	pfree(stats);
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2635,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2659,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2715,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2868,271 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation sizes of the table and indexes don't affect the parallel
+ * degree for now.  nrequested is the number of parallel workers that the
+ * user requested.  If nrequested is 0, we compute the parallel degree
+ * based on the number of indexes that support parallel index
+ * vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		IndexAmRoutine *amroutine = GetIndexAmRoutine(Irel[i]->rd_amhandler);
+
+		if (amroutine->amcanparallelvacuum)
+			nindexes_to_vacuum++;
+	}
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_to_vacuum == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	LVShared	*shared;
+	ParallelContext *pcxt;
+	LVDeadTuples	*tidmap;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared,
+								   mul_size(sizeof(LVSharedIndStats), nindexes)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	/*
+	 * Initialize index statistics and check which indexes can participate
+	 * in parallel index vacuuming.
+	 *
+	 * XXX: We allocate the space for all indexes regardless of whether it
+	 * will be used.  It is okay for now since the size of index statistics
+	 * is small enough.
+	 */
+	MemSet(shared->indstats, 0, sizeof(LVSharedIndStats) * nindexes);
+	for (i = 0; i < nindexes; i++)
+	{
+		IndexAmRoutine *amroutine = GetIndexAmRoutine(Irel[i]->rd_amhandler);
+
+		if (amroutine->amcanparallelvacuum)
+			shared->indstats[i].in_use = true;
+		else
+			shared->nindexes_nonparallel++;
+	}
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* prepare the dead tuple space */
+	tidmap = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	tidmap->max_tuples = maxtuples;
+	tidmap->num_tuples = 0;
+	MemSet(tidmap->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, tidmap);
+	vacrelstats->dead_tuples = tidmap;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed while in parallel mode, and it might not be safe
+ * to exit parallel mode while keeping the parallel context.  So we copy
+ * the updated index statistics to local memory and then later use that
+ * to update the index statistics.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (!lps->lvshared->indstats[i].in_use)
+		{
+			Assert(!lps->lvshared->indstats[i].updated);
+			continue;
+		}
+
+		if (lps->lvshared->indstats[i].updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(lps->lvshared->indstats[i].stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers work only within index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because the lock mode does not conflict among parallel workers.
+	 */
+	onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match
+	 * the order in the leader process.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	/* Do either index vacuuming or index cleanup */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	heap_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e154507ecd..ff8c7760c0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -1736,6 +1765,22 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		relation_close(onerel, lmode);
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+		/* It's OK to proceed with ANALYZE on this table */
+		return true;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313337..de43d1e4f0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no workers,
+	 * and 0 means choose the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..f134412c3d 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,20 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+WARNING:  skipping "tmp" --- cannot parallel vacuum temporary tables
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..66a9b110fe 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

Attachment: v30-0001-Add-a-index-AM-field-to-check-parallel-index-par.patch (text/x-patch)
From 8b0e75e69be6f34757dcaa43d8b73037365297a0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v30 1/3] Add a index AM field to check parallel index
 participation

gist indexes don't support parallel index vacuuming for now since their
bulkdelete returns a larger structure of which IndexBulkDeleteResult is
just the first field.
---
 contrib/bloom/blutils.c              | 1 +
 doc/src/sgml/indexam.sgml            | 2 ++
 src/backend/access/brin/brin.c       | 1 +
 src/backend/access/gist/gist.c       | 1 +
 src/backend/access/hash/hash.c       | 1 +
 src/backend/access/nbtree/nbtree.c   | 1 +
 src/backend/access/spgist/spgutils.c | 1 +
 src/include/access/amapi.h           | 2 ++
 8 files changed, 10 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index dbb24cb5b2..6dbfca0f4a 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -122,6 +122,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..fa5682db04 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -120,6 +120,8 @@ typedef struct IndexAmRoutine
     bool        ampredlocks;
     /* does AM support parallel scan? */
     bool        amcanparallel;
+    /* does AM support parallel vacuum? */
+    bool        amcanparallelvacuum;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
     /* type of data stored in index, or InvalidOid if variable */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729edd..6ea48fb555 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -100,6 +100,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc87911d6..f44c2fd2ff 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -74,6 +74,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = false;
 	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30dac42..f21d9ac78f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -73,6 +73,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = INT4OID;
 
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad..e885aadc21 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -122,6 +122,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db147..0c86b63f65 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -55,6 +55,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..f7d2a1b7e3 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -195,6 +195,8 @@ typedef struct IndexAmRoutine
 	bool		ampredlocks;
 	/* does AM support parallel scan? */
 	bool		amcanparallel;
+	/* does AM support parallel vacuum? */
+	bool		amcanparallelvacuum;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
 	/* type of data stored in index, or InvalidOid if variable */
-- 
2.22.0

Attachment: v30-0003-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch)
From 77e5455339cda63b27fdab055c0988d4c8cf1cc5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v30 3/3] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute the vacuum in parallel by using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is at
+        least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* Allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

#119Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#118)

On Wed, Oct 16, 2019 at 6:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Attached updated patch set. The 0001 patch introduces a new index AM
field amcanparallelvacuum. All index AMs except for gist set it to true
for now. The 0002 patch incorporates all the comments I got so far.

I haven't studied the latest patch in detail, but it seems you are
still assuming that all indexes will have the same amount of shared
memory for index stats and copying it in the same way. I thought we
agreed that each index AM should do this on its own. The basic
problem is as of now we see this problem only with the Gist index, but
some other index AMs could also have a similar problem.

Another major problem with the previous and this patch version is that
the cost-based vacuum concept seems to be entirely broken. Basically,
each parallel vacuum worker operates independently w.r.t. vacuum delay
and cost. Assume that the overall I/O allowed for the vacuum operation
is X, after which it will sleep for some time, reset the balance and
continue. In the patch, each worker will be allowed to perform X before
it sleeps, and there is also no coordination for the same with the
master backend. This is somewhat similar to the memory usage problem,
but a bit more tricky because here we can't easily split the I/O for
each of the workers.

One idea could be that we somehow map vacuum costing related
parameters to the shared memory (dsm) which the vacuum operation is
using and then allow workers to coordinate. This way master and
worker processes will have the same view of balance cost and can act
accordingly.
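
To make that concrete, here is a minimal sketch of what such
coordination could look like. LVSharedCostState, its fields, and
shared_vacuum_delay_point are names I am inventing for illustration,
not anything from the posted patches, though pg_atomic_* and pg_usleep
are existing primitives:

#include "postgres.h"
#include "port/atomics.h"

/* Illustrative only: cost-delay state placed in the vacuum's DSM segment */
typedef struct LVSharedCostState
{
	pg_atomic_uint32 cost_balance;	/* shared VacuumCostBalance */
	int			cost_limit;		/* snapshot of VacuumCostLimit */
	int			cost_delay_ms;	/* snapshot of VacuumCostDelay */
} LVSharedCostState;

/*
 * Leader and workers would call this instead of updating their local
 * VacuumCostBalance, so every process sees one shared balance.
 */
static void
shared_vacuum_delay_point(LVSharedCostState *cstate, int cost_incurred)
{
	uint32		balance;

	balance = pg_atomic_add_fetch_u32(&cstate->cost_balance, cost_incurred);

	if (balance >= (uint32) cstate->cost_limit)
	{
		/*
		 * Whoever crosses the limit resets the shared balance and sleeps.
		 * (A real implementation would also have to handle the race where
		 * two processes cross the limit at the same time.)
		 */
		pg_atomic_write_u32(&cstate->cost_balance, 0);
		pg_usleep(cstate->cost_delay_ms * 1000L);
	}
}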

The other idea could be that we come up with some smart way to split
the I/O among workers. Initially, I thought we could try something as
we do for autovacuum workers (see autovac_balance_cost), but I think
that will require much more math. Before launching workers, we need
to compute the remaining I/O (the heap operation would have used
something) after which we need to sleep and continue the operation, and
then somehow split it equally across workers. Once the workers are
finished, they need to let the master backend know how much I/O they
have consumed, and then the master backend can add it to its current
I/O consumed.
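
As a rough sketch of that second idea, and assuming the leader knows
how much cost balance the heap pass has already accumulated, the split
could look like this (the function and its arguments are hypothetical,
not from the posted patches; Max() is the usual c.h macro):

/*
 * Illustrative only: before launching workers, give each of the
 * nworkers processes an equal share of the I/O budget left over
 * after the heap pass.
 */
static int
compute_worker_cost_limit(int cost_limit, int leader_cost_balance,
						  int nworkers)
{
	int			remaining = Max(cost_limit - leader_cost_balance, 0);

	/* Never hand out a zero limit, which would stall the throttling math */
	return Max(remaining / nworkers, 1);
}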

I think this problem matters because the vacuum delay is useful for
large vacuums and this patch is trying to exactly solve that problem,
so we can't ignore this problem. I am not yet sure what is the best
solution to this problem, but I think we need to do something for it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#120Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#119)

On Wed, Oct 16, 2019 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 16, 2019 at 6:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Attached updated patch set. The 0001 patch introduces a new index AM
field amcanparallelvacuum. All index AMs except for gist set it to true
for now. The 0002 patch incorporates all the comments I got so far.

I haven't studied the latest patch in detail, but it seems you are
still assuming that all indexes will have the same amount of shared
memory for index stats and copying it in the same way.

Yeah, I thought we agreed at least to have amcanparallelvacuum, and if
an index AM cannot support parallel index vacuuming, like gist, it
returns false.

I thought we
agreed that each index AM should do this on its own. The basic
problem is as of now we see this problem only with the Gist index, but
some other index AMs could also have a similar problem.

Okay. I'm thinking we're going to have a new callback to ask index AMs
for the size of the structure they use within both ambulkdelete and
amvacuumcleanup. But copying it to DSM can be done by the core because
it knows how many bytes need to be copied to DSM. Is that okay?
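
For illustration, the callback could be shaped like the following. The
name amestimateparallelvacuum is hypothetical, not from the posted
patches, and GistBulkDeleteResult here is a sketch standing in for the
AM-private structure discussed above:

/*
 * Sketch of the AM-private result type discussed above:
 * IndexBulkDeleteResult must be the first field so the core can
 * treat the structure generically.
 */
typedef struct GistBulkDeleteResult
{
	IndexBulkDeleteResult stats;	/* must be first */
	BlockNumber emptyPagesCount;	/* illustrative private field */
} GistBulkDeleteResult;

/* Hypothetical IndexAmRoutine callback: report the real result size */
typedef Size (*amestimateparallelvacuum_function) (void);

/*
 * A gist-like AM would report the size of the whole structure so the
 * core can reserve and copy the right number of bytes in DSM.
 */
static Size
gistestimateparallelvacuum(void)
{
	return sizeof(GistBulkDeleteResult);
}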

Another major problem with the previous and this patch version is that
the cost-based vacuum concept seems to be entirely broken. Basically,
each parallel vacuum worker operates independently w.r.t. vacuum delay
and cost. Assume that the overall I/O allowed for the vacuum operation
is X, after which it will sleep for some time, reset the balance and
continue. In the patch, each worker will be allowed to perform X before
it sleeps, and there is also no coordination for the same with the
master backend. This is somewhat similar to the memory usage problem,
but a bit more tricky because here we can't easily split the I/O for
each of the workers.

One idea could be that we somehow map vacuum costing related
parameters to the shared memory (dsm) which the vacuum operation is
using and then allow workers to coordinate. This way master and
worker processes will have the same view of balance cost and can act
accordingly.

The other idea could be that we come up with some smart way to split
the I/O among workers. Initially, I thought we could try something as
we do for autovacuum workers (see autovac_balance_cost), but I think
that will require much more math. Before launching workers, we need
to compute the remaining I/O (the heap operation would have used
something) after which we need to sleep and continue the operation, and
then somehow split it equally across workers. Once the workers are
finished, they need to let the master backend know how much I/O they
have consumed, and then the master backend can add it to its current
I/O consumed.

I think this problem matters because the vacuum delay is useful for
large vacuums and this patch is trying to exactly solve that problem,
so we can't ignore this problem. I am not yet sure what is the best
solution to this problem, but I think we need to do something for it.

I guess that the concept of vacuum delay contradicts the concept of
parallel vacuum. The concept of parallel vacuum is to use more
resources to make vacuum faster. Vacuum delay balances I/O during
vacuum in order to avoid I/O spikes caused by vacuum, but parallel
vacuum rather concentrates I/O into a shorter duration. Since we need
to share the memory in the entire system we need to deal with the
memory issue, but disks are different.

If we need to deal with this problem, how about just dividing
vacuum_cost_limit by the parallel degree and setting that as each
worker's vacuum_cost_limit?
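
In code, that proposal would amount to something like this sketch (the
function name is made up, and the plumbing that hands the value to each
worker is omitted; Max() is the usual c.h macro):

/*
 * Illustrative only: each of the nworkers parallel workers gets an
 * equal slice of vacuum_cost_limit as its own local limit.
 */
static int
divide_vacuum_cost_limit(int vacuum_cost_limit, int nworkers)
{
	return Max(vacuum_cost_limit / nworkers, 1);
}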

Regards,

--
Masahiko Sawada

#121Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#118)

Hi
I applied all 3 patches and ran the regression tests. I was getting one
regression failure.

diff -U3 /home/mahendra/postgres_base_rp/postgres/src/test/regress/expected/vacuum.out /home/mahendra/postgres_base_rp/postgres/src/test/regress/results/vacuum.out
--- /home/mahendra/postgres_base_rp/postgres/src/test/regress/expected/vacuum.out	2019-10-17 10:01:58.138863802 +0530
+++ /home/mahendra/postgres_base_rp/postgres/src/test/regress/results/vacuum.out	2019-10-17 11:41:20.930699926 +0530
@@ -105,7 +105,7 @@
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
-WARNING:  skipping "tmp" --- cannot parallel vacuum temporary tables
+WARNING:  skipping vacuum on "tmp" --- cannot vacuum temporary tables in parallel
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.

It looks like you changed the warning message for temp tables, but
haven't updated the expected output file.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

On Wed, 16 Oct 2019 at 06:50, Masahiko Sawada <sawada.mshk@gmail.com> wrote:


On Tue, Oct 15, 2019 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 1:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

Yes. But considering your point, I guess that there might be other
index AMs that use the stats returned from bulkdelete in a similar way
to the gist index (i.e. using a larger structure of which
IndexBulkDeleteResult is just the first field). If we have the same
concern, the parallel vacuum still needs to deal with that as you
mentioned.

Right, apart from some functions for memory allocation/estimation and
stats copy, we might need something like amcanparallelvacuum, so that
index methods can have the option to not participate in parallel
vacuum due to reasons similar to gist or something else. I think we
can work towards this direction as this anyway seems to be required,
and till we reach any conclusion for gist indexes, you can mark
amcanparallelvacuum for gist indexes as false.

Agreed. I'll create a separate patch to add this callback and change
the parallel vacuum patch so that it checks which indexes participate
and then vacuums the non-participating indexes after the parallel
vacuum.

amcanparallelvacuum doesn't need to be a callback; it can be a boolean
field of IndexAmRoutine.

Yes, it will be a boolean. Note that for parallel-index scans, we
already have amcanparallel.

Attached updated patch set. The 0001 patch introduces a new index AM
field amcanparallelvacuum. All index AMs except for gist set it to true
for now. The 0002 patch incorporates all the comments I got so far.

Regards,

--
Masahiko Sawada

#122Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Mahendra Singh (#121)

On Thu, Oct 17, 2019 at 3:18 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I applied all 3 patches and ran regression test. I was getting one regression failure.

diff -U3 /home/mahendra/postgres_base_rp/postgres/src/test/regress/expected/vacuum.out /home/mahendra/postgres_base_rp/postgres/src/test/regress/results/vacuum.out
--- /home/mahendra/postgres_base_rp/postgres/src/test/regress/expected/vacuum.out 2019-10-17 10:01:58.138863802 +0530
+++ /home/mahendra/postgres_base_rp/postgres/src/test/regress/results/vacuum.out 2019-10-17 11:41:20.930699926 +0530
@@ -105,7 +105,7 @@
CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
CREATE INDEX tmp_idx1 ON tmp (a);
VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
-WARNING:  skipping "tmp" --- cannot parallel vacuum temporary tables
+WARNING:  skipping vacuum on "tmp" --- cannot vacuum temporary tables in parallel
-- INDEX_CLEANUP option
CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
-- Use uncompressed data stored in toast.

It looks like you changed the warning message for temp tables, but haven't updated the expected output file.

Thank you!
I forgot to change the expected file. I'll fix it in the next version patch.

Regards,

--
Masahiko Sawada

#123Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#120)

On Thu, Oct 17, 2019 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Oct 16, 2019 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 16, 2019 at 6:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Oct 15, 2019 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Attached updated patch set. The 0001 patch introduces a new index AM
field amcanparallelvacuum. All index AMs except for gist set it to true
for now. The 0002 patch incorporates all the comments I got so far.

I haven't studied the latest patch in detail, but it seems you are
still assuming that all indexes will have the same amount of shared
memory for index stats and copying it in the same way.

Yeah, I thought we agreed at least to have amcanparallelvacuum, and if
an index AM cannot support parallel index vacuuming, like gist, it
returns false.

I thought we
agreed that each index AM should do this on its own. The basic
problem is as of now we see this problem only with the Gist index, but
some other index AMs could also have a similar problem.

Okay. I'm thinking we're going to have a new callback to ask index AMs
for the size of the structure they use within both ambulkdelete and
amvacuumcleanup. But copying it to DSM can be done by the core because
it knows how many bytes need to be copied to DSM. Is that okay?

That sounds okay.

Another major problem with the previous and this patch version is that
the cost-based vacuum concept seems to be entirely broken. Basically,
each parallel vacuum worker operates independently w.r.t. vacuum delay
and cost. Assume that the overall I/O allowed for the vacuum operation
is X, after which it will sleep for some time, reset the balance and
continue. In the patch, each worker will be allowed to perform X before
it sleeps, and there is also no coordination for the same with the
master backend. This is somewhat similar to the memory usage problem,
but a bit more tricky because here we can't easily split the I/O for
each of the workers.

One idea could be that we somehow map vacuum costing related
parameters to the shared memory (dsm) which the vacuum operation is
using and then allow workers to coordinate. This way master and
worker processes will have the same view of balance cost and can act
accordingly.

The other idea could be that we come up with some smart way to split
the I/O among workers. Initially, I thought we could try something as
we do for autovacuum workers (see autovac_balance_cost), but I think
that will require much more math. Before launching workers, we need
to compute the remaining I/O (the heap operation would have used
something) after which we need to sleep and continue the operation, and
then somehow split it equally across workers. Once the workers are
finished, they need to let the master backend know how much I/O they
have consumed, and then the master backend can add it to its current
I/O consumed.

I think this problem matters because the vacuum delay is useful for
large vacuums and this patch is trying to exactly solve that problem,
so we can't ignore this problem. I am not yet sure what is the best
solution to this problem, but I think we need to do something for it.

I guess that the concept of vacuum delay contradicts the concept of
parallel vacuum. The concept of parallel vacuum is to use more
resources to make vacuum faster. Vacuum delay balances I/O during
vacuum in order to avoid I/O spikes caused by vacuum, but parallel
vacuum rather concentrates I/O into a shorter duration.

You have a point, but the way it is currently working in the patch
doesn't make much sense. Basically, each of the parallel workers will
be allowed to use the complete I/O limit, which is actually the limit for
the entire vacuum operation. It doesn't give any consideration to the
work done for the heap.

Since we need to share the memory in the entire system we need to deal
with the memory issue, but disks are different.

If we need to deal with this problem how about just dividing
vacuum_cost_limit by the parallel degree and setting it to worker's
vacuum_cost_limit?

How will we take the I/O done by heap into consideration? The
vacuum_cost_limit is the cost limit for the entire vacuum operation, not
separately for heap and indexes. What makes you think that
considering the limit for heap and index separately is not
problematic?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#124Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#123)

On Thu, Oct 17, 2019 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I guess that the concept of vacuum delay contradicts the concept of
parallel vacuum. The concept of parallel vacuum is to use more
resources to make vacuum faster. Vacuum delay balances I/O during
vacuum in order to avoid I/O spikes caused by vacuum, but parallel
vacuum rather concentrates I/O into a shorter duration.

You have a point, but the way it is currently working in the patch
doesn't make much sense.

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#125Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#124)

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I guess that the concept of vacuum delay contradicts the concept of
parallel vacuum. The concept of parallel vacuum is to use more
resources to make vacuum faster. Vacuum delay balances I/O during
vacuum in order to avoid I/O spikes caused by vacuum, but parallel
vacuum rather concentrates I/O into a shorter duration.

You have a point, but the way it is currently working in the patch
doesn't make much sense.

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea I
proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though other parallel workers are
still running, so the sleep might not help?

Regards,

--
Masahiko Sawada

#126Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#125)

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I guess that the concept of vacuum delay contradicts the concept of
parallel vacuum. The concept of parallel vacuum is to use more
resources to make vacuum faster. Vacuum delay balances I/O during
vacuum in order to avoid I/O spikes caused by vacuum, but parallel
vacuum rather concentrates I/O into a shorter duration.

You have a point, but the way it is currently working in the patch
doesn't make much sense.

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea I
proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though other parallel workers are
still running, so the sleep might not help?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and OTOH other
workers who are doing very little I/O might become the victims and
have their operation unnecessarily delayed.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#127Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#126)

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea I
proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though other parallel workers are
still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and OTOH other
workers who are doing very little I/O might become the victims and
have their operation unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, isn't a similar thing happening now too, where the
heap might have done a major portion of the I/O but soon after we
start vacuuming the index, we hit the limit and sleep.

I think this might not be the perfect solution and we should try to
come up with something else if this doesn't seem to be working. Have
you guys thought about the second solution I mentioned in email [1]/messages/by-id/CAA4eK1+ySETHCaCnAsEC-dC4GSXaE2sNGMOgD6J=X+N43bBqJQ@mail.gmail.com
(Before launching workers, we need to compute the remaining I/O ....)?
Any other better ideas?

[1]: /messages/by-id/CAA4eK1+ySETHCaCnAsEC-dC4GSXaE2sNGMOgD6J=X+N43bBqJQ@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#128Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#127)

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea I
proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though other parallel workers are
still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I think this might not be the perfect solution and we should try to
come up with something else if this doesn't seem to be working. Have
you guys thought about the second solution I mentioned in email [1]
(Before launching workers, we need to compute the remaining I/O ....)?
Any other better ideas?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#129Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#128)

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.
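
To illustrate, the per-page pattern in the lazy vacuum code is
roughly the following (a simplified sketch, not the exact code;
vac_strategy stands for whatever buffer strategy is in use):

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        /* may sleep here if the accumulated cost balance is over the limit */
        vacuum_delay_point();

        buf = ReadBufferExtended(onerel, MAIN_FORKNUM, blkno,
                                 RBM_NORMAL, vac_strategy);
        /* ... process the page, charging VacuumCostPage* as we go ... */
    }

So the delay check does sit directly in front of each page access.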

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

[1]: /messages/by-id/CAA4eK1+ySETHCaCnAsEC-dC4GSXaE2sNGMOgD6J=X+N43bBqJQ@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#130Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#129)

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.

OK, I agree with the point that we check this only when we do an
I/O operation. But we also need to consider that each I/O operation
has a different weight. So even if we have a delay point at each
I/O operation, there is a possibility that we delay a worker which
is just reading a buffer with a page hit (VacuumCostPageHit), while
the other worker, which is actually dirtying pages
(VacuumCostPageDirty = 20), continues the work and does more I/O.
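
For reference, the relative weights come from the cost-based vacuum
parameters; the accounting is roughly as below (a simplified sketch
with the default weights, not the exact code):

    /* charge points in the buffer access path */
    if (page_found_in_shared_buffers)
        VacuumCostBalance += VacuumCostPageHit;     /* default 1 */
    else
        VacuumCostBalance += VacuumCostPageMiss;    /* default 10 */

    if (page_dirtied_for_the_first_time)
        VacuumCostBalance += VacuumCostPageDirty;   /* default 20 */

So two workers calling vacuum_delay_point equally often can still be
doing very different amounts of real I/O.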

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#131Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#130)

On Fri, Oct 18, 2019 at 3:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.

OK, I agree with the point that we check this only when we do an
I/O operation. But we also need to consider that each I/O operation
has a different weight. So even if we have a delay point at each
I/O operation, there is a possibility that we delay a worker which
is just reading a buffer with a page hit (VacuumCostPageHit), while
the other worker, which is actually dirtying pages
(VacuumCostPageDirty = 20), continues the work and does more I/O.

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach (a)?

Yes, I will try to write the PoC patch for approach (a).

Regards,

--
Masahiko Sawada

#132Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#130)
2 attachment(s)

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.

OK, I agree with the point that we check this only when we do an
I/O operation. But we also need to consider that each I/O operation
has a different weight. So even if we have a delay point at each
I/O operation, there is a possibility that we delay a worker which
is just reading a buffer with a page hit (VacuumCostPageHit), while
the other worker, which is actually dirtying pages
(VacuumCostPageDirty = 20), continues the work and does more I/O.

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

I have come up with the POC for approach (a).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.
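
With the numbers from the test below (the leader participating plus
two workers, i.e. three shares), the division works out as:

    VacuumCostLimit              = 2000
    per-process cost limit       = 2000 / 3 = 666
    balance left after heap scan = 1724
    per-process start balance    = 1724 / 3 = 574

which matches the limit=666 and start balance=574 values in the
traces below.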

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

POC-v1-0001-divide-vacuum-cost-limit.patch (application/octet-stream)
From 3346ed95bff15607e90a69741b0fcc5b90aff9c9 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 23 Oct 2019 10:46:49 +0530
Subject: [PATCH v1] divide vacuum cost limit

---
 src/backend/access/heap/vacuumlazy.c | 119 ++++++++++++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2faa4e9..73b6af7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -137,6 +137,7 @@
 #define PARALLEL_VACUUM_KEY_SHARED			1
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+#define PARALLEL_VACUUM_KEY_COST_BALANCE	4
 
 /*
  * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
@@ -227,6 +228,14 @@ typedef struct LVShared
 } LVShared;
 #define SizeOfLVShared offsetof(LVShared, indstats) + sizeof(LVSharedIndStats)
 
+typedef struct LVCostBalance
+{
+	pg_atomic_uint32	nslot;
+	int		nworkers;
+	int		vaccostbalance[FLEXIBLE_ARRAY_MEMBER];
+} LVCostBalance;
+#define SizeOfLVCostBalance offsetof(LVCostBalance, vaccostbalance) + sizeof(int)
+
 /* Struct for parallel lazy vacuum */
 typedef struct LVParallelState
 {
@@ -235,6 +244,8 @@ typedef struct LVParallelState
 	/* Shared information among parallel vacuum workers */
 	LVShared		*lvshared;
 
+	/* Shared cost balance. */
+	LVCostBalance	*lvcostbalance;
 	/*
 	 * Always true except for a debugging case where
 	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
@@ -1927,6 +1938,31 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+static void
+compute_cost_balance(LVParallelState *lps)
+{
+	int i;
+
+	/*
+	 * Share the estimated worker count so that each worker can compute its
+	 * cost limit.  Include the leader if it is participating in the index
+	 * vacuum phase.
+	 * XXX: Fewer workers might be launched than estimated; in that case
+	 * each worker operates with a smaller vacuum cost limit than necessary.
+	 */
+	lps->lvcostbalance->nworkers = lps->pcxt->nworkers;
+	if (lps->leaderparticipates)
+		lps->lvcostbalance->nworkers += 1;
+
+	/*
+	 * Divide the current cost balance among the workers so that we don't
+	 * lose accounting of the I/O balance accrued so far.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers; i++)
+		lps->lvcostbalance->vaccostbalance[i] =
+				VacuumCostBalance / lps->lvcostbalance->nworkers;
+}
+
 /*
  * Vacuuming indexes with parallel vacuum workers. This function must be used
  * by the parallel vacuum leader process.
@@ -1936,6 +1972,8 @@ lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 							 int nindexes, IndexBulkDeleteResult **stats,
 							 LVParallelState *lps)
 {
+	int i;
+	
 	Assert(!IsParallelWorker());
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
@@ -1950,6 +1988,9 @@ lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	lps->lvshared->reltuples = vacrelstats->old_live_tuples;
 	lps->lvshared->estimated_count = true;
 
+	/* Compute cost balance for the workers. */
+	compute_cost_balance(lps);
+
 	LaunchParallelWorkers(lps->pcxt);
 
 	ereport(elevel,
@@ -1963,14 +2004,38 @@ lazy_parallel_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	 * does that in case where no workers launched.
 	 */
 	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+	{
+		int base_cost_limit = VacuumCostLimit;
+
+		/*
+		 * If the leader is participating and we have launched parallel workers,
+		 * then compute the leader's share of the cost limit and cost balance.
+		 */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			VacuumCostLimit /= lps->lvcostbalance->nworkers;
+			VacuumCostBalance /= lps->lvcostbalance->nworkers;
+		}
+
 		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
 										 vacrelstats->dead_tuples);
+		VacuumCostLimit = base_cost_limit;
+	}
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
 	/* Reset the processing count */
 	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+	pg_atomic_write_u32(&(lps->lvcostbalance->nslot), 0);
+
+	/*
+	 * The index vacuuming phase is complete, so collect the remaining balance
+	 * from all the workers and add it to the leader's current balance, so that
+	 * we don't lose the accounting for the workers' extra I/O balance.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+		VacuumCostBalance += lps->lvcostbalance->vaccostbalance[i];
 
 	/*
 	 * Reinitialize the parallel context to relaunch parallel workers
@@ -1988,6 +2053,8 @@ lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 									   int nindexes, IndexBulkDeleteResult **stats,
 									   LVParallelState *lps)
 {
+	int i;
+
 	Assert(!IsParallelWorker());
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
@@ -2004,6 +2071,9 @@ lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	lps->lvshared->estimated_count =
 		(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
 
+	/* Compute cost balance for the workers. */
+	compute_cost_balance(lps);
+
 	LaunchParallelWorkers(lps->pcxt);
 
 	ereport(elevel,
@@ -2017,12 +2087,34 @@ lazy_parallel_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	 * that in case where no workers launched.
 	 */
 	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+	{
+		int base_cost_limit = VacuumCostLimit;
+
+		/*
+		 * If the leader is participating and we have launched parallel workers,
+		 * then compute the leader's share of the cost limit and cost balance.
+		 */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			VacuumCostLimit /= lps->lvcostbalance->nworkers;
+			VacuumCostBalance /= lps->lvcostbalance->nworkers;
+		}
+
 		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
 										 vacrelstats->dead_tuples);
+		VacuumCostLimit = base_cost_limit;
+	}
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
+	/*
+	 * The index vacuuming phase is complete, so collect the remaining balance
+	 * from all the workers and add it to the leader's current balance, so that
+	 * we don't lose the accounting for the workers' extra I/O balance.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+		VacuumCostBalance += lps->lvcostbalance->vaccostbalance[i];
 
 	/*
 	 * We don't need to reinitialize the parallel context unlike parallel index
@@ -2937,10 +3029,12 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	LVShared	*shared;
 	ParallelContext *pcxt;
 	LVDeadTuples	*tidmap;
+	LVCostBalance	*costbalance;
 	long	maxtuples;
 	char	*sharedquery;
 	Size	est_shared;
 	Size	est_deadtuples;
+	Size	est_costbalance;
 	int		querylen;
 	int		i;
 
@@ -3019,6 +3113,14 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	memcpy(sharedquery, debug_query_string, querylen + 1);
 	sharedquery[querylen] = '\0';
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+	
+	/* Vacuum cost balance. */
+	est_costbalance = MAXALIGN(add_size(SizeOfLVCostBalance,
+								   mul_size(sizeof(int), nrequested)));	
+	costbalance = (LVCostBalance *) shm_toc_allocate(pcxt->toc, est_costbalance);
+	pg_atomic_init_u32(&(costbalance->nslot), 0);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_COST_BALANCE, costbalance);
+	lps->lvcostbalance = costbalance;
 
 	return lps;
 }
@@ -3084,8 +3186,10 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	Relation	*indrels;
 	LVShared	*lvshared;
 	LVDeadTuples	*dead_tuples;
+	LVCostBalance	*costbalance;	
 	int			nindexes;
 	char		*sharedquery;
+	int			slot;
 	IndexBulkDeleteResult **stats;
 
 	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
@@ -3118,6 +3222,11 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
 												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
 												  false);
+	
+	costbalance = (LVCostBalance *) shm_toc_lookup(toc,
+												   PARALLEL_VACUUM_KEY_COST_BALANCE,
+												   false);
+	slot = pg_atomic_fetch_add_u32(&(costbalance->nslot), 1);
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
@@ -3126,13 +3235,21 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
 
+	/* Compute the vacuum cost limit for the worker. */
+	VacuumCostLimit = VacuumCostLimit / costbalance->nworkers;
+	VacuumCostBalance = costbalance->vaccostbalance[slot];
+
 	stats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
 
 	/* Do either vacuuming indexes or cleaning indexes */
 	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
 									 dead_tuples);
-
+	/*
+	 * Share the remaining balance with the leader so that we don't lose
+	 * the accounting for it.
+	 */
+	costbalance->vaccostbalance[slot] = VacuumCostBalance;
 	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
 	table_close(onerel, ShareUpdateExclusiveLock);
 }
-- 
1.8.3.1

test_and_observarion.patch (application/octet-stream)
Test case:

drop table test;
create table test(a int, b varchar, c varchar);

create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from generate_series(1,200000) as i;
delete from test where a < 100000;
\timing on
--vacuum test;
vacuum (PARALLEL 3) test;

VacuumCostDelay=10ms
VacuumCostLimit=2000
Total Vacuum Cost of the operation: ~24030   Heap Scan Cost: ~1724  Index Scan Cost: ~22350
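
For scale, assuming the accounting is exact, a single-process vacuum
of this table should sleep about 24030 / 2000 ~= 12 times, i.e.
~120ms of total delay at 10ms per sleep.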

I have put some tracepoints in the code to observe how the I/O throttling is done.
On master, when the complete vacuum is done by a single process, we can observe that every time
the balance reaches ~2000 the delay is imposed.

=============================================================================

Test target : Parallel vacuum patch (v30)
Phase1: Heap vacuuming
Leader:  Vacuum delay pid=12695 balance=2001 limit=2000 delay=10.005000 ms
Leader:  Vacuum Index start vacuum_cost_balance=1724

Phase2: Index vacuuming

Delay point ->  At almost the same time, each worker hits the delay point when its cost balance is ~2000.
So basically we get the first delay at a cost of ~6000 whereas VacuumCostLimit is set to 2000.  I think we can
observe the same thing at all the subsequent delay points.

Leader:   Vacuum delay pid=12695 balance=2005 limitDK=2000 delay=10.025000 ms
Worker1:  Vacuum delay pid=13733 balance=2003 limit=2000 delay=10.015000 ms
Worker2:  Vacuum delay pid=13732 balance=2003 limit=2000 delay=10.015000 ms

Worker1:  Vacuum delay pid=13733 balance=2007 limit=2000 delay=10.035000 ms
Worker2:  Vacuum delay pid=13732 balance=2007 limit=2000 delay=10.035000 ms

Worker1:  Vacuum delay pid=13733 balance=2000 limit=2000 delay=10.000000 ms
Worker2:  Vacuum delay pid=13732 balance=2001 limit=2000 delay=10.005000 ms

Worker1:  Vacuum delay pid=13733 balance=2005 limit=2000 delay=10.025000 ms
Worker2:  Vacuum delay pid=13732 balance=2001 limit=2000 delay=10.005000 ms

Worker2:  Vacuum delay pid=13732 balance=2000 limit=2000 delay=10.000000 ms
Leader:   Vacuum delay pid=12695 balance=2000 limit=2000 delay=10.000000 ms
Leader:   Vacuum delay pid=12695 balance=2000 limit=2000 delay=10.000000 ms


Observation:  With the parallel vacuum we have noticed a few problems:
1. Every worker is assigned the full limit, so the first delay occurs when the I/O cost is VacuumCostLimit * workers.
2. The remaining balance from the heap scan phase is not handed over to the workers; it is only used by the leader, so
it can significantly increase the I/O usage if the workers are performing most of the work.



================================================================================

Test target: Parallel vacuum patch + POC-v1-0001-divide-vacuum-cost-limit
Phase1: Heap vacuuming
Leader:  Vacuum delay pid=48698 balance=2001 limit=2000 delay=10.005000 ms
Leader:  Vacuum Index start vacuum_cost_balance=1724

Phase2: Index vacuuming (The remaining cost balance from the heap scan is equally divided among the workers)
worker1:  pid = 49182 worker start balance=574
worker2:  pid = 49181 worker start balance=574


worker1:  Vacuum delay pid=49182 balance=668 limit=666 delay=10.030030 ms
leader:   Vacuum delay pid=48698 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=669 limit=666 delay=10.045045 ms

delaypoint ->  Each worker delays at almost the same time and the combined cost is ~2000, which is equal to VacuumCostLimit
worker1:  Vacuum delay pid=49182 balance=669 limit=666 delay=10.045045 ms
leader:   Vacuum delay pid=48698 balance=666 limit=666 delay=10.000000 ms
worker2:  Vacuum delay pid=49181 balance=666 limit=666 delay=10.000000 ms
delaypoint ->
worker1:  Vacuum delay pid=49182 balance=667 limit=666 delay=10.015015 ms
leader:   Vacuum delay pid=48698 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=668 limit=666 delay=10.030030 ms
delaypoint ->
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms
leader:   Vacuum delay pid=48698 balance=666 limit=666 delay=10.000000 ms
worker2:  Vacuum delay pid=49181 balance=674 limit=666 delay=10.120120 ms
delaypoint ->
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=671 limit=666 delay=10.075075 ms
leader:   Vacuum delay pid=48698 balance=672 limit=666 delay=10.090090 ms
delaypoint->
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=672 limit=666 delay=10.090090 ms
delaypoint->
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=672 limit=666 delay=10.090090 ms
leader:   Vacuum delay pid=48698 balance=666 limit=666 delay=10.080000 ms
delaypoint->
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms
worker2:  Vacuum delay pid=49181 balance=666 limit=666 delay=10.000000 ms
worker1:  Vacuum delay pid=49182 balance=667 limit=666 delay=10.015015 ms

worker2:  Vacuum delay pid=49181 balance=672 limit=666 delay=10.090090 ms
worker1:  Vacuum delay pid=49182 balance=671 limit=666 delay=10.075075 ms
leader:   Vacuum delay pid=48698 balance=672 limit=666 delay=10.060070 ms

leader:   Vacuum delay pid=48698 balance=671 limit=666 delay=10.080000 ms
worker2:  Vacuum delay pid=49181 balance=671 limit=666 delay=10.075075 ms
worker1:  Vacuum delay pid=49182 balance=672 limit=666 delay=10.090090 ms

....
At the end of the index vacuuming we combine the remaining cost balance from all the workers
and add it to the leader's balance for the next phase of the heap vacuuming.
       :  Post parallel balance=894

Observation:
1. The main problem we observed with the parallel vacuum, that the I/O limit overshoots before it delays, is solved here by dividing the I/O limit.
2. If some workers have very little work to do and are not doing much I/O, the other workers still have a small I/O limit; this can cause more frequent delays than required.

#133Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#132)

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

I have come up with the POC for approach (a).

I think you mean to say approach (b).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

Can we compute the overall throttling (sleep time) in the operation
separately for the heap and the indexes, then divide the indexes'
sleep time by the number of workers and add it to the heap's sleep
time? Then, it will be a bit easier to compare the data between the
parallel and non-parallel cases.
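
That is, the comparison metric would be something like (just a
sketch of the formula):

    effective_sleep(parallel) = sleep(heap phase)
                              + sum(index-phase sleep of all processes) / nworkers

which should then be directly comparable to the total sleep time of
a non-parallel vacuum.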

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

I think it is always a good idea to summarize the results and state
your conclusion about them. AFAICT, it seems to me this technique as
done in the patch might not work for cases where there is an uneven
amount of work done by the parallel workers (say the index sizes
vary, maybe due to partial indexes, index column width, or some
other reason). The reason is that when a worker finishes its work we
don't rebalance the cost among the other workers. Can we generate
such a test and see how it behaves? I think it might be possible to
address this if it turns out to be a problem.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#134Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#133)

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

I have come up with the POC for approach (a).

I think you mean to say approach (b).

Yeah, sorry for the confusion. It's approach (b).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

Can we compute the overall throttling (sleep time) in the operation
separately for the heap and the indexes, then divide the indexes'
sleep time by the number of workers and add it to the heap's sleep
time? Then, it will be a bit easier to compare the data between the
parallel and non-parallel cases.

Okay, I will try to do that.

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

I think it is always a good idea to summarize the results and state
your conclusion about them. AFAICT, it seems to me this technique as
done in the patch might not work for cases where there is an uneven
amount of work done by the parallel workers (say the index sizes
vary, maybe due to partial indexes, index column width, or some
other reason). The reason is that when a worker finishes its work we
don't rebalance the cost among the other workers.

Right, that's one problem I observed.

Can we generate such a test and see how it behaves? I think it might
be possible to address this if it turns out to be a problem.

Yeah, we can address this by rebalancing the cost.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#135Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#132)

On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.

OK, I agree with the point that we check this only when we do an
I/O operation. But we also need to consider that each I/O operation
has a different weight. So even if we have a delay point at each
I/O operation, there is a possibility that we delay a worker which
is just reading a buffer with a page hit (VacuumCostPageHit), while
the other worker, which is actually dirtying pages
(VacuumCostPageDirty = 20), continues the work and does more I/O.

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

I have come up with the POC for approach (a).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

Thank you!

For approach (a), the basic idea I've come up with is that we have a
shared balance value in DSM and each worker, including the leader
process, adds its local balance value to it in vacuum_delay_point;
then, based on the shared value, workers sleep. I'll submit that
patch with other updates.
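
A minimal sketch of that delay logic (the names here are hypothetical
and the actual patch may differ):

    /* pointer into a counter stored in DSM */
    pg_atomic_uint32 *VacuumSharedCostBalance;

    /* in vacuum_delay_point(), parallel case */
    uint32  shared;

    /* push the locally accrued cost into the shared balance */
    shared = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
                                     (uint32) VacuumCostBalance);
    VacuumCostBalance = 0;

    while (shared >= VacuumCostLimit)
    {
        /* try to take one limit's worth out of the shared balance */
        if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
                                           &shared,
                                           shared - VacuumCostLimit))
        {
            pg_usleep((long) (VacuumCostDelay * 1000));
            break;
        }
        /* lost the race; 'shared' was reloaded, so just re-check */
    }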

Regards,

--
Masahiko Sawada

#136Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#135)

On Thu, Oct 24, 2019 at 8:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have come up with the POC for approach (a).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

Thank you!

For approach (a), the basic idea I've come up with is that we have a
shared balance value in DSM and each worker, including the leader
process, adds its local balance value to it in vacuum_delay_point;
then, based on the shared value, workers sleep. I'll submit that
patch with other updates.

I think it would be better if we can prepare the I/O balance patches
on top of the main patch and evaluate both approaches. We can test
both approaches and integrate the one which turns out to be good.

Note that I will be away next week, so I won't be able to review your
latest patch unless you are planning to post it today or tomorrow.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#137Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#136)

On Fri, Oct 25, 2019 at 7:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 8:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have come up with the POC for approach (a).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

I have printed these logs for the parallel vacuum patch (v30) vs.
v30 + the patch for dividing the I/O limit (attached to this mail).

Note: Patch and the test results are attached.

Thank you!

For approach (a), the basic idea I've come up with is that we have a
shared balance value in DSM and each worker, including the leader
process, adds its local balance value to it in vacuum_delay_point;
then, based on the shared value, workers sleep. I'll submit that
patch with other updates.

I think it would be better if we can prepare the I/O balance patches
on top of the main patch and evaluate both approaches. We can test
both approaches and integrate the one which turns out to be good.

Just to add something on testing both approaches: I think we can
first come up with a way to compute the throttling vacuum does, as
mentioned by me in one of the emails above [1], or in some other way.
I think Dilip is planning to give it a try, and once we have that we
can evaluate both patches. Some of the tests I have in mind are:
a. All indexes have an equal amount of deleted data.
b. Indexes have an uneven amount of deleted data.
c. Try with a mix of indexes (btree, gin, gist, hash, etc.) on a table.

Feel free to add more tests.

[1]: /messages/by-id/CAA4eK1+PeiFLdTuwrE6CvbNdx80E-O=ZxCuWB2maREKFD-RaCA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#138Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#135)

On Thu, Oct 24, 2019 at 8:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has the option
to turn off the cost-based vacuum; by default, it is disabled. So,
if the user enables it, we have to provide some sensible behavior.
If we can't come up with anything then, in the end, we might want to
turn it off for a parallel vacuum and mention that in the docs, but
I think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea
I proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though the other parallel
workers are still running, so the sleep might not help?

Remember that the other running workers will also increase
VacuumCostBalance and whichever worker finds that it becomes greater
than VacuumCostLimit will reset its value and sleep. So, won't this
make sure that overall throttling works the same?

I agree with this point. There is a possibility that some of the
workers who are doing heavy I/O continue to work and, OTOH, other
workers who are doing very little I/O might become the victims and
have their operations unnecessarily delayed.

Sure, but will it impact the overall I/O? I mean to say the rate
limit we want to provide for the overall vacuum operation will still
be the same. Also, doesn't a similar thing happen now as well, where
the heap scan might have done a major portion of the I/O but, soon
after we start vacuuming the index, we hit the limit and sleep?

Actually, what I meant is that the worker who is performing actual
I/O might not go for the delay, while another worker which has done
only CPU work pays the penalty. So basically the worker doing a
CPU-intensive operation might go for the delay and pay the penalty,
while the worker performing actual I/O continues to work and does
further I/O. Do you think this is not a practical problem?

I don't know. Generally, we try to delay (if required) before
processing (reading/writing) one page, which means it will happen
for I/O-intensive operations, so I am not sure the point you are
making is completely correct.

OK, I agree with the point that we check this only when we do an
I/O operation. But we also need to consider that each I/O operation
has a different weight. So even if we have a delay point at each
I/O operation, there is a possibility that we delay a worker which
is just reading a buffer with a page hit (VacuumCostPageHit), while
the other worker, which is actually dirtying pages
(VacuumCostPageDirty = 20), continues the work and does more I/O.

Stepping back a bit, OTOH, I think that we cannot guarantee that
the worker which has done more I/O will continue to do further I/O,
or that the one which has not done much I/O will not perform more
I/O in the future. So it might not be too bad if we compute shared
costs as you suggested above.

I am thinking that if we can write patches for both the approaches
(a. compute shared costs and delay based on that; b. divide the I/O
cost among workers as described in the email above[1]) and do some
tests to see the behavior of throttling, that might help us in
deciding the best strategy to solve this problem, if any. What do
you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on
approach (a), then we can do some testing and compare. Sawada-san,
would you by any chance be interested in writing a POC for approach
(a)? Otherwise, I will try to write it after finishing the first
one (approach b).

I have come up with the POC for approach (a).

The idea is:
1) Before launching the workers, divide the current VacuumCostBalance
among them so that each worker starts accumulating its balance from
that point.
2) Also, divide the VacuumCostLimit among the workers.
3) Once the workers are done with the index vacuum, send the
remaining balances back to the leader.
4) The leader sums all the balances, adds that to its current
VacuumCostBalance, and starts accumulating its balance from this
point.

I was trying to test the behaviour of the vacuum I/O limit, but I
could not find an easy way to do that, so I just put tracepoints in
the code and checked at what point we impose the delay.
I also printed the cost balance at various points to see after how
much I/O accumulation we hit the delay. Please feel free to suggest
a better way to test this.

I have printed these logs for parallel vacuum patch (v30) vs v(30) +
patch for dividing i/o limit (attached with the mail)

Note: Patch and the test results are attached.

Thank you!

For approach (a), the basic idea I've come up with is that we have a
shared balance value in DSM, and each worker, including the leader
process, adds its local balance value to it in vacuum_delay_point; the
workers then sleep based on the shared value. I'll submit that patch
with other updates.

IMHO, if we add the local balance to the shared balance in
vacuum_delay_point and each worker is working with the full limit,
then there will be a problem, right? Suppose VacuumCostLimit is 2000;
the first time each worker hits vacuum_delay_point its local balance
will be 2000, so in most cases the first delay will be hit when the
gross I/O is 6000 (if there are 3 workers).

I think if we want to have the shared accounting then we must always
accumulate the balance in a shared variable, so that as soon as the
gross balance hits VacuumCostLimit we can have the delay point.

Maybe we can do this (a rough sketch follows below):
1. Change VacuumCostBalance from an integer to a pg_atomic_uint32 *.
2. In the heap_parallel_vacuum_main function, make this point into a
shared memory location. Basically, for the non-parallel case it will
point to the process-specific global variable, whereas in the parallel
case it will point to a shared memory variable.
3. Now, in the code (I think 5-6 occurrences) wherever we are using
VacuumCostBalance, change them to use atomic operations.
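
A rough sketch of these three steps; VacuumCostBalancePtr,
attach_shared_cost_balance and vacuum_cost_add are hypothetical names,
and the atomic primitives are the existing ones from port/atomics.h:

#include "postgres.h"
#include "port/atomics.h"

/* Step 1: the balance becomes a pointer to an atomic counter. */
static pg_atomic_uint32 LocalVacuumCostBalance;		/* non-parallel case */
static pg_atomic_uint32 *VacuumCostBalancePtr = &LocalVacuumCostBalance;

/*
 * Step 2: a parallel worker redirects the pointer into the DSM segment
 * (the counter must have been set up with pg_atomic_init_u32 already).
 */
static void
attach_shared_cost_balance(pg_atomic_uint32 *shared_balance)
{
	VacuumCostBalancePtr = shared_balance;
}

/* Step 3: call sites accumulate cost through atomic operations. */
static inline void
vacuum_cost_add(int cost)
{
	pg_atomic_add_fetch_u32(VacuumCostBalancePtr, (uint32) cost);
}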

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#139Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#138)

On Fri, Oct 25, 2019 at 12:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Another point in this regard is that the user anyway has an option to
turn off the cost-based vacuum. By default, it is anyway disabled.
So, if the user enables it we have to provide some sensible behavior.
If we can't come up with anything, then, in the end, we might want to
turn it off for a parallel vacuum and mention the same in docs, but I
think we should try to come up with a solution for it.

I finally got your point and now understand the need. And the idea I
proposed doesn't work well.

So you mean that all workers share the cost count, and if a parallel
vacuum worker increases the cost and it reaches the limit, only that
one worker sleeps? Is that okay even though other parallel workers are
still running, in which case the sleep might not help?

[...]

IMHO, if we add the local balance to the shared balance in
vacuum_delay_point and each worker is working with the full limit,
then there will be a problem, right? Suppose VacuumCostLimit is 2000;
the first time each worker hits vacuum_delay_point its local balance
will be 2000, so in most cases the first delay will be hit when the
gross I/O is 6000 (if there are 3 workers).

To explain my idea in more detail: the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value
exceeds VacuumCostLimit, but before sleeping it subtracts
VacuumCostLimit from the shared value. Since vacuum_delay_point is
typically called once per page processed, I expect there will be no
such problem. Thoughts?
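
For illustration, a minimal sketch of this scheme, assuming a
hypothetical VacuumSharedCostBalance counter living in the DSM segment
(the atomic primitives are from port/atomics.h, and the real logic
would live in vacuum_delay_point):

#include "postgres.h"
#include "miscadmin.h"
#include "port/atomics.h"

static pg_atomic_uint32 *VacuumSharedCostBalance;	/* lives in DSM */

static void
shared_vacuum_delay_point(void)
{
	uint32		shared;

	/* Flush the local balance into the shared one and reset it. */
	shared = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
									 (uint32) VacuumCostBalance);
	VacuumCostBalance = 0;

	if (shared >= (uint32) VacuumCostLimit)
	{
		/*
		 * Subtract the limit before napping so that cost added
		 * concurrently by other workers is not thrown away.
		 */
		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
								(uint32) VacuumCostLimit);
		pg_usleep((long) (VacuumCostDelay * 1000L));
	}
}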

Regards,

--
Masahiko Sawada

#140Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#139)

On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

[...]

To explain my idea in more detail: the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value
exceeds VacuumCostLimit, but before sleeping it subtracts
VacuumCostLimit from the shared value. Since vacuum_delay_point is
typically called once per page processed, I expect there will be no
such problem. Thoughts?

Oh right, I assumed that you were adding the local balance to the
shared value only when it exceeded VacuumCostLimit, but you are adding
it to the shared value every time in vacuum_delay_point. So I think
your idea is correct.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#141Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#140)
6 attachment(s)

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

[...]

Oh right, I assumed that you were adding the local balance to the
shared value only when it exceeded VacuumCostLimit, but you are adding
it to the shared value every time in vacuum_delay_point. So I think
your idea is correct.

I've attached the updated patch set.

The first three patches add new variables and a callback to the index AM.

The next two patches are the main part of the parallel vacuum
support. I've incorporated all review comments I got so far. The
memory layout of the variable-length index statistics might be a bit
complex: it's similar to the heap tuple header format, having a null
bitmap, and both the size of the index statistics and the actual data
for each index follow it (a rough sketch of the lookup is below).
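
For illustration, a rough sketch of how such an entry would be looked
up, using the GetSharedIndStats, IndStatsIsNull and
SizeOfSharedIndStats macros defined in the 0004 patch below
(find_indstats itself is a hypothetical name):

/*
 * Walk the variable-length stats area that follows the LVShared header:
 * skip indexes whose bitmap bit is 0 (they have no stats entry), and
 * otherwise advance by each entry's stored size until the requested
 * index is reached.
 */
static LVSharedIndStats *
find_indstats(LVShared *lvshared, int getidx, int nindexes)
{
	char	   *p = (char *) GetSharedIndStats(lvshared);

	for (int i = 0; i < nindexes; i++)
	{
		if (IndStatsIsNull(lvshared, i))
			continue;			/* no stats entry for this index */
		if (i == getidx)
			return (LVSharedIndStats *) p;
		p += SizeOfSharedIndStats(p);
	}
	return NULL;				/* getidx itself has no stats entry */
}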

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion, so amcanparallelvacuum of the gist index is true.

[1]: /messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com

Regards,

--
Masahiko Sawada

Attachments:

v31-0002-Add-an-index-AM-callback-to-estimate-DSM-for-par.patch
From bf8e8ae5ded91327d504a19227c96378d3d0b513 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 24 Oct 2019 14:51:16 +0900
Subject: [PATCH v31 2/6] Add an index AM callback to estimate DSM for parallel
 vacuum

---
 contrib/bloom/blutils.c                       |  1 +
 doc/src/sgml/indexam.sgml                     | 17 +++++++++++
 src/backend/access/brin/brin.c                |  1 +
 src/backend/access/gin/ginutil.c              |  1 +
 src/backend/access/gist/gist.c                |  1 +
 src/backend/access/hash/hash.c                |  1 +
 src/backend/access/index/indexam.c            | 29 +++++++++++++++++++
 src/backend/access/nbtree/nbtree.c            |  1 +
 src/backend/access/spgist/spgutils.c          |  1 +
 src/include/access/amapi.h                    |  9 ++++++
 src/include/access/genam.h                    |  1 +
 .../modules/dummy_index_am/dummy_index_am.c   |  1 +
 12 files changed, 64 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 98163c81bd..9ef14a47f3 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -145,6 +145,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index fa5682db04..c3d2352d0f 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -151,6 +151,9 @@ typedef struct IndexAmRoutine
     amestimateparallelscan_function amestimateparallelscan;    /* can be NULL */
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
+
+    /* interface functions to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
   </para>
@@ -733,6 +736,20 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory which the
+   access method will need to copy its statistics into.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than the size of <structname>IndexBulkDeleteResult</structname>.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6ea48fb555..4045f5eacf 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -125,6 +125,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 0c33809c83..9832f651ef 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -77,6 +77,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0363bf814a..88b1e839b3 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -99,6 +99,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index f21d9ac78f..3666318064 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -98,6 +98,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..5238b9d38f 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -711,6 +711,35 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 	return indexRelation->rd_indam->amvacuumcleanup(info, stats);
 }
 
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+	Size		nbytes;
+
+	RELATION_CHECKS;
+
+	/*
+	 * If amestimateparallelvacuum is not provided, assume only
+	 * IndexBulkDeleteResult is needed.
+	 */
+	if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+	{
+		nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+		Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+	}
+	else
+		nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+	return nbytes;
+}
+
 /* ----------------
  *		index_can_return
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e885aadc21..f1db77886c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -147,6 +147,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = btestimateparallelscan;
 	amroutine->aminitparallelscan = btinitparallelscan;
 	amroutine->amparallelrescan = btparallelrescan;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 0c86b63f65..ff66c3ac6c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -80,6 +80,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index f7d2a1b7e3..549912c1c9 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -156,6 +156,12 @@ typedef void (*aminitparallelscan_function) (void *target);
 /* (re)start parallel index scan */
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
+/*
+ * Callback function signatures - for parallel index vacuuming.
+ */
+/* estimate size of parallel index vacuuming memory */
+typedef Size (*amestimateparallelvacuum_function) (void);
+
 /*
  * API struct for an index AM.  Note this must be stored in a single palloc'd
  * chunk of memory.
@@ -232,6 +238,9 @@ typedef struct IndexAmRoutine
 	amestimateparallelscan_function amestimateparallelscan; /* can be NULL */
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
+
+	/* interface functions to support parallel vacuum */
+	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
 
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a813b004be..48ed5bbac7 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -179,6 +179,7 @@ extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
 												void *callback_state);
 extern IndexBulkDeleteResult *index_vacuum_cleanup(IndexVacuumInfo *info,
 												   IndexBulkDeleteResult *stats);
+extern Size index_parallelvacuum_estimate(Relation indexRelation);
 extern bool index_can_return(Relation indexRelation, int attno);
 extern RegProcedure index_getprocid(Relation irel, AttrNumber attnum,
 									uint16 procnum);
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index f12eefbb24..c90405a23b 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -325,6 +325,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
-- 
2.22.0

v31-0003-Add-an-index-AM-field-to-check-if-use-maintenanc.patch
From a1fa1f544ecbdc529254c021f139776f6cacbae0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 24 Oct 2019 16:30:02 +0900
Subject: [PATCH v31 3/6] Add an index AM field to check if use
 maintenance_work_mem

---
 contrib/bloom/blutils.c                          | 1 +
 doc/src/sgml/indexam.sgml                        | 2 ++
 src/backend/access/brin/brin.c                   | 1 +
 src/backend/access/gin/ginutil.c                 | 1 +
 src/backend/access/gist/gist.c                   | 1 +
 src/backend/access/hash/hash.c                   | 1 +
 src/backend/access/nbtree/nbtree.c               | 1 +
 src/backend/access/spgist/spgutils.c             | 1 +
 src/include/access/amapi.h                       | 2 ++
 src/test/modules/dummy_index_am/dummy_index_am.c | 1 +
 10 files changed, 12 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 9ef14a47f3..d50122d9e2 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -122,6 +122,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index c3d2352d0f..df4cad11b3 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -124,6 +124,8 @@ typedef struct IndexAmRoutine
     bool        amcanparallelvacuum;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 4045f5eacf..4a3286bfde 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -102,6 +102,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 9832f651ef..a28a71999d 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -54,6 +54,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 88b1e839b3..752b5bc88c 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -76,6 +76,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 3666318064..dc0dc312ef 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index f1db77886c..1ea2ba3fe0 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -124,6 +124,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = true;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index ff66c3ac6c..4a1689859a 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -57,6 +57,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 549912c1c9..d166350bbe 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -205,6 +205,8 @@ typedef struct IndexAmRoutine
 	bool		amcanparallelvacuum;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index c90405a23b..374d545f0d 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -302,6 +302,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcanparallelvacuum = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.22.0

v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch
From 0eedb72e6fd99e4dee95b3fb43430823f313ca97 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v31 5/6] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with <productname>PostgreSQL</productname>'s
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

v31-0004-Add-parallel-option-to-VACUUM-command.patch
From 9dcbbbcfaafeef51ff5ea069124265dd9ad572e6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v31 4/6] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than
the number of indexes that the table has.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 ++
 src/backend/access/heap/vacuumlazy.c  | 1049 ++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |    4 +
 src/backend/commands/vacuum.c         |   45 ++
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/commands/vacuum.h         |    5 +
 src/test/regress/expected/vacuum.out  |   14 +
 src/test/regress/sql/vacuum.sql       |   10 +
 11 files changed, 1084 insertions(+), 109 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 886632ff43..335a0ec752 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2265,13 +2265,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..ae086b976b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel
+      vacuum operation, further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table. Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option applies only to the vacuum part
+     of the command.  Even if this option is specified together with the
+     <option>ANALYZE</option> option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..02040c837e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed the parallel worker processes exit.  The leader
+ * process then re-initializes the parallel context while keeping the recorded
+ * dead tuples so that it can launch parallel workers again for the next
+ * phase.  Note that the parallel workers live only during either index
+ * vacuuming or index cleanup, but the leader process neither exits from
+ * parallel mode nor destroys the parallel context.  Since updates are not
+ * allowed during parallel mode, we update the index statistics after
+ * exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,139 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum
+ * mode, otherwise in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers. So this is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers of doing either index vacuuming or
+	 * index cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old live tuples in the index vacuuming case or the new live
+	 * tuples in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup, apart from the memory for heap scanning,
+	 * if an index consumes memory during ambulkdelete and amvacuumcleanup.
+	 * In parallel index vacuuming, since individual vacuum workers
+	 * consume memory, we set a new maintenance_work_mem for each worker
+	 * so as not to consume more memory than single process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/* The number of indexes that do NOT support parallel index vacuuming */
+	int		nindexes_nonparallel;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The index statistics
+	 * returned from ambulkdelete and amvacuumcleanup are nullable and of
+	 * variable length.  'bitmap' is the NULL bitmap: a 0 bit indicates a
+	 * null, while a 1 bit indicates non-null.  The index statistics follow
+	 * at the end of the struct.
+	 */
+	pg_atomic_uint32	nprocessed;	/* counter for vacuuming and clean up */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
+ * follows at end of struct.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated */
+
+	/* Index bulk-deletion result data follows at end of struct */
+} LVSharedIndStats;
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
 
@@ -155,12 +302,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +315,36 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
 
 
 /*
@@ -488,6 +658,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples in the DSM segment. All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes. At the end of
+ *		this function we exit from parallel mode. Index bulk-deletion results
+ *		are stored in the DSM segment and we update the index statistics as a
+ *		whole after exiting from parallel mode, since no writes are allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +678,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +702,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +738,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we are vacuuming indexes,
+	 * compute the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum.  We allocate the memory space for the
+		 * dead tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +950,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +979,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +999,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1195,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1234,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1380,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1450,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1479,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1594,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1628,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1644,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1670,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1748,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1757,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1805,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1816,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1946,290 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process.  The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform index
+ * vacuuming or index cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	if (lps->lvshared->for_cleanup)
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+								 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+	else
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+								 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join as a parallel worker.  The leader process does this alone when
+	 * no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * Unless this was index cleanup, after which no further parallel work is
+	 * performed, reinitialize the parallel context so that workers can be
+	 * relaunched for the next execution.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes, including the leader process.  After finishing each
+ * index, this function copies the index statistics returned from
+ * ambulkdelete or amvacuumcleanup into the DSM segment.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+		IndexBulkDeleteResult *bulkdelete_res;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get index statistics struct of this index */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/* Skip if this index doesn't support parallel index vacuuming */
+		if (shared_indstats == NULL)
+			continue;
+
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if another process has already updated it.
+		 */
+		if (shared_indstats->updated && stats[idx] == NULL)
+			stats[idx] = bulkdelete_res;
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete
+		 * or amvacuumcleanup into the DSM segment the first time we get
+		 * one, because the AM allocates it locally and a different vacuum
+		 * process might process this index next time.  This copy normally
+		 * happens only after the first index vacuuming; from then on we
+		 * pass the result stored in the DSM segment so that the AM can
+		 * update it in place.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result at
+		 * different slots we can write them without locking.
+		 */
+		if (!shared_indstats->updated && stats[idx] != NULL)
+		{
+			memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+			shared_indstats->updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = bulkdelete_res;
+		}
+	}
+}
+
+/*
+ * Clean up indexes.  In the parallel vacuum case, this function must be
+ * used by the parallel vacuum leader process.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		/*
+		 * Generally, index cleanup does not scan the index when index
+		 * vacuuming (ambulkdelete) has already been performed.  So we
+		 * perform index cleanup with parallel workers only if we have not
+		 * done index vacuuming yet; otherwise the leader process does it
+		 * alone.
+		 */
+		if (vacrelstats->num_index_scans == 0)
+			lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+													stats, lps);
+		else
+		{
+			/*
+			 * Do cleanup by the leader process alone.  Since we need to
+			 * copy the index statistics into the DSM segment, we cannot
+			 * call lazy_cleanup_index directly.
+			 */
+			vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
+											 lps->lvshared,
+											 vacrelstats->dead_tuples);
+		}
+
+		/*
+		 * We are done if every index supports parallel index vacuuming.
+		 * Otherwise fall through to clean up the remaining indexes in the
+		 * leader process alone.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already cleaned up during parallel
+		 * index vacuuming.
+		 */
+		if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+			continue;
+
+		lazy_cleanup_index(Irel[idx], &stats[idx],
+						   vacrelstats->new_rel_tuples,
+						   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes.  In the parallel vacuum case, this function must be
+ * used by the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index vacuuming with
+	 * parallel workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+
+		/*
+		 * We are done if every index supports parallel index vacuuming.
+		 * Otherwise fall through to vacuum the remaining indexes in the
+		 * leader process alone.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already vacuumed during parallel index
+		 * vacuuming.
+		 */
+		if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+			continue;
+
+		lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+						  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2239,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples, and estimated_count is
+ *		true if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2278,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
 
-	pfree(stats);
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2641,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2665,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2721,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2874,330 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The sizes of the table and indexes do not affect the parallel degree for
+ * now. nrequested is the number of parallel workers that the user requested;
+ * if nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		IndexAmRoutine *amroutine = GetIndexAmRoutine(Irel[i]->rd_amhandler);
+
+		if (amroutine->amcanparallelvacuum)
+			nindexes_to_vacuum++;
+	}
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_to_vacuum == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		if (Irel[i]->rd_indam->amcanparallelvacuum)
+			est_shared = add_size(est_shared,
+									add_size(sizeof(LVSharedIndStats),
+											 index_parallelvacuum_estimate(Irel[i])));
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
+	prepare_index_statistics(shared, Irel, nindexes);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the null bitmap and
+ * the struct size for each index.  This function also counts the indexes
+ * that do not support parallel index vacuuming and computes the
+ * maintenance_work_mem share for the workers vacuuming indexes that use it.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
+		autovacuum_work_mem != -1 ?
+		autovacuum_work_mem : maintenance_work_mem;
+	int nindexes_mwm = 0;
+	int i;
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (!Irel[i]->rd_indam->amcanparallelvacuum)
+		{
+			/*
+			 * Leave the bitmap bit at 0 (the DSM area was zeroed at
+			 * initialization) to mark that this index does not support
+			 * parallel vacuum.
+			 */
+			lvshared->nindexes_nonparallel++;
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ? vac_work_mem / nindexes_mwm : vac_work_mem;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed during parallel mode, and it might not be safe to
+ * exit parallel mode while keeping the parallel context.  So we copy the
+ * updated index statistics into local memory and use that later to update
+ * the index statistics in pg_class.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p = (char *) GetSharedIndStats(lvshared);
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	/* Skip over the slots of all preceding non-null indexes */
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers participate only in index vacuuming and
+ * index cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * That is okay because the lock mode does not conflict among the
+	 * parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Do either index vacuuming or index cleanup */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 4b67b40b28..9ada501709 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -1742,6 +1771,22 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		relation_close(onerel, lmode);
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+		/* It's OK to proceed with ANALYZE on this table */
+		return true;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * parallel workers, and 0 means the parallel degree is chosen based on
+	 * the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..91db6a10b0 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,20 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- warning, cannot vacuum temporary tables in parallel
+WARNING:  skipping vacuum on "tmp" --- cannot vacuum temporary tables in parallel
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..66a9b110fe 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- warning, cannot vacuum temporary tables in parallel
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0
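
Stepping back from the diff: the work distribution in
vacuum_or_cleanup_indexes_worker() above boils down to each process
claiming index numbers from a shared atomic counter until the counter
runs past nindexes. A minimal standalone sketch of that pattern follows;
process_one_index is a placeholder, not a function in the patch:

/*
 * Minimal sketch of the index-claiming loop used by the leader and the
 * parallel workers.  Everything except the pg_atomic_* call is a
 * placeholder for illustration.
 */
static void
claim_and_process_indexes(pg_atomic_uint32 *nprocessed, int nindexes)
{
	for (;;)
	{
		/* Atomically claim the next unprocessed index number */
		int		idx = (int) pg_atomic_fetch_add_u32(nprocessed, 1);

		if (idx >= nindexes)
			break;				/* every index has been claimed */

		process_one_index(idx);	/* ambulkdelete or amvacuumcleanup */
	}
}

Because the counter hands out each index number exactly once, no two
processes ever vacuum the same index, and no locking is needed beyond
the atomic fetch-add itself.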

v31-0006-PoC-shared-vacuum-cost-balance.patch (text/x-patch)
From a4bce0f6d662e4e42d98d6a9ffe70728e254f64a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 21:56:24 +0900
Subject: [PATCH v31 6/6] PoC: shared vacuum cost balance

---
 src/backend/access/heap/vacuumlazy.c | 23 ++++++++-
 src/backend/commands/vacuum.c        | 72 +++++++++++++++++++++++-----
 src/include/access/heapam.h          |  1 +
 3 files changed, 81 insertions(+), 15 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 02040c837e..cf0ccee037 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -211,6 +211,13 @@ typedef struct LVShared
 	/* The number of indexes that do NOT support parallel index vacuuming */
 	int		nindexes_nonparallel;
 
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming,
+	 * VacuumSharedCostBalance points to this value, which accumulates the
+	 * balance of all parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
 	/*
 	 * Variables to control parallel index vacuuming.  Index statistics
 	 * returned from ambulkdelete and amvacuumcleanup is nullable variable
@@ -230,6 +237,9 @@ typedef struct LVShared
 #define IndStatsIsNull(s, i) \
 	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
 
+/* Global variable for shared cost-based vacuum delay */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+
 /*
  * Struct for an index bulk-deletion statistic used for parallel lazy
  * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
@@ -1960,6 +1970,10 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
 
+	/* Move the current balance to the shared value */
+	pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
+	VacuumCostBalance = 0;
+
 	LaunchParallelWorkers(lps->pcxt);
 
 	if (lps->lvshared->for_cleanup)
@@ -1987,11 +2001,15 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
 	/*
-	 * Unless this was index cleanup, after which no further parallel work is
-	 * performed, reinitialize the parallel context so that workers can be
-	 * relaunched for the next execution.
+	 * After index cleanup we need neither to reinitialize the parallel
+	 * context nor to reset the vacuum cost balance, as no further index
+	 * vacuuming or index cleanup will be performed.
 	 */
 	if (!lps->lvshared->for_cleanup)
 	{
+		/* Continue to use the shared balance value */
+		VacuumCostBalance = pg_atomic_read_u32(&(lps->lvshared->cost_balance));
+
 		/* Reset the processing count */
 		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
 
@@ -2999,6 +3017,7 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
 	prepare_index_statistics(shared, Irel, nindexes);
 	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
 
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
 	lps->lvshared = shared;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9ada501709..7ace51e099 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -412,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1990,28 +1991,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	bool require_sleep = false;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (VacuumCostActive && !InterruptPending)
 	{
-		double		msec;
+		/*
+		 * If the vacuum cost balance is shared among parallel workers we
+		 * decide whether to sleep based on that.
+		 */
+		if (VacuumSharedCostBalance != NULL)
+		{
+			while (true)
+			{
+				uint32 shared_balance;
+				uint32 new_balance;
 
-		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
-		if (msec > VacuumCostDelay * 4)
-			msec = VacuumCostDelay * 4;
+				require_sleep = false;
 
-		pg_usleep((long) (msec * 1000));
+				/* compute new balance by adding the local value */
+				shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+				new_balance = shared_balance + VacuumCostBalance;
 
-		VacuumCostBalance = 0;
+				if (new_balance >= VacuumCostLimit)
+				{
+					require_sleep = true;
+					new_balance -= VacuumCostLimit;
+				}
+
+				if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+												   &shared_balance,
+												   new_balance))
+					break;
+			}
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+			/*
+			 * Reset the local balance as we accumulated it into the shared
+			 * value.
+			 */
+			VacuumCostBalance = 0;
+		}
+		else if (VacuumCostBalance >= VacuumCostLimit)
+		{
+			/* In single process vacuum check only the local balance */
+			require_sleep = true;
+		}
+
+		/* Nap if appropriate */
+		if (require_sleep)
+		{
+			double		msec;
 
-		/* Might have gotten an interrupt while sleeping */
-		CHECK_FOR_INTERRUPTS();
+			msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+			if (msec > VacuumCostDelay * 4)
+				msec = VacuumCostDelay * 4;
+
+			pg_usleep((long) (msec * 1000));
+
+			VacuumCostBalance = 0;
+
+			/* update balance values for workers */
+			AutoVacuumUpdateDelay();
+
+			/* Might have gotten an interrupt while sleeping */
+			CHECK_FOR_INTERRUPTS();
+		}
 	}
 }
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 12065cc038..ac883f67d1 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -192,6 +192,7 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
-- 
2.22.0

v31-0001-Add-an-index-AM-field-to-check-parallel-index-pa.patch (text/x-patch)
From f8a88f7fcc4a19031fd4c42c9afa247b1655e51a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v31 1/6] Add an index AM field to check parallel index
 participation

---
 contrib/bloom/blutils.c                          | 1 +
 doc/src/sgml/indexam.sgml                        | 2 ++
 src/backend/access/brin/brin.c                   | 1 +
 src/backend/access/gin/ginutil.c                 | 1 +
 src/backend/access/gist/gist.c                   | 1 +
 src/backend/access/hash/hash.c                   | 1 +
 src/backend/access/nbtree/nbtree.c               | 1 +
 src/backend/access/spgist/spgutils.c             | 1 +
 src/include/access/amapi.h                       | 2 ++
 src/test/modules/dummy_index_am/dummy_index_am.c | 1 +
 10 files changed, 12 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 3d44616adc..98163c81bd 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -120,6 +120,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..fa5682db04 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -120,6 +120,8 @@ typedef struct IndexAmRoutine
     bool        ampredlocks;
     /* does AM support parallel scan? */
     bool        amcanparallel;
+    /* does AM support parallel vacuum? */
+    bool        amcanparallelvacuum;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
     /* type of data stored in index, or InvalidOid if variable */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729edd..6ea48fb555 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -100,6 +100,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index cf9699ad18..0c33809c83 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -52,6 +52,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc87911d6..0363bf814a 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -74,6 +74,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30dac42..f21d9ac78f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -73,6 +73,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = INT4OID;
 
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad..e885aadc21 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -122,6 +122,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db147..0c86b63f65 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -55,6 +55,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..f7d2a1b7e3 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -195,6 +195,8 @@ typedef struct IndexAmRoutine
 	bool		ampredlocks;
 	/* does AM support parallel scan? */
 	bool		amcanparallel;
+	/* does AM support parallel vacuum? */
+	bool		amcanparallelvacuum;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
 	/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index bc68767f3a..f12eefbb24 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -300,6 +300,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = false;
 	amroutine->amcaninclude = false;
 	amroutine->amkeytype = InvalidOid;
 
-- 
2.22.0

#142Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#141)

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

In more detail, my idea is that the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value exceeds
VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit from
the shared value. Since vacuum_delay_point is typically called once per
page processed, I expect there will be no such problem. Thoughts?

Oh right, I had assumed that you add the local balance to the shared
value only when the local balance exceeds VacuumCostLimit, but you are
adding it to the shared value every time in vacuum_delay_point. So I
think your idea is correct.
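As a rough illustration of the quoted scheme, a minimal sketch (written
here for explanation only, not taken from any of the attached patches;
it assumes an atomic counter living in DSM plus the usual
VacuumCostBalance, VacuumCostLimit and VacuumCostDelay globals) could
look like:

    /*
     * Sketch only: move the local cost balance into the shared balance
     * and sleep once the shared balance crosses the limit.
     */
    static void
    shared_cost_delay_sketch(pg_atomic_uint32 *shared_balance)
    {
        uint32      new_balance;

        /* Add the local balance to the shared value, then reset it. */
        new_balance = pg_atomic_add_fetch_u32(shared_balance, VacuumCostBalance);
        VacuumCostBalance = 0;

        if (new_balance >= (uint32) VacuumCostLimit)
        {
            /* Claim one "limit" worth of cost before sleeping. */
            pg_atomic_fetch_sub_u32(shared_balance, VacuumCostLimit);
            pg_usleep((long) (VacuumCostDelay * 1000));
        }
    }

The v32-0004 patch attached later in the thread implements a refined
version of this idea using a compare-and-exchange loop.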

I've attached the updated patch set.

First three patches add new variables and a callback to index AM.

The next two patches are the main part of parallel vacuum support. I've
incorporated all review comments I got so far. The memory layout of the
variable-length index statistics might be a bit complex. It's similar to
the heap tuple header format in that it has a null bitmap, and both the
size of the index statistics and the actual data for each index follow.
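Roughly, the shared area has the shape sketched below (a simplified
illustration with invented names; see the attached patches for the exact
definitions). A null bitmap records which indexes have statistics
stored, and each per-index entry carries its size in front of the
statistics data:

    /* Simplified sketch of the shared statistics area (names invented). */
    typedef struct LVSharedSketch
    {
        uint32  offset;     /* offset of the first per-index entry */
        bits8   bitmap[FLEXIBLE_ARRAY_MEMBER];  /* null bitmap: bit i is set
                                                 * when index i has stats */
    } LVSharedSketch;

    /* One entry per index follows the bitmap: */
    typedef struct LVSharedIndStatsSketch
    {
        bool    updated;    /* written by a worker yet? */
        Size    size;       /* size of the statistics data that follows */
        /* IndexBulkDeleteResult (or AM-specific data) follows */
    } LVSharedIndStatsSketch;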

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion. So canparallelvacuum of gist index is true.

[1] /messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com

I haven't yet read the new patch set, but I have noticed one thing: we
are getting the size of the statistics using the AM routine, yet we are
copying those statistics from local memory to shared memory directly
using memcpy. Wouldn't it be a good idea to have an AM-specific routine
that copies them from local memory to shared memory? I am not sure
whether it is worth it, but my thought is that it would allow an AM to
keep its local stats in any form (for example, storing a pointer in
them) and serialize them while copying to the shared stats. Later, when
the shared stats are passed back to the AM, it can deserialize them into
its local form and use them.
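For instance, a hypothetical pair of callbacks along these lines (the
names and signatures below are invented for illustration; they are not
part of the attached patches) could sit next to the size-estimation
callback in IndexAmRoutine:

    /* Hypothetical additions to IndexAmRoutine -- illustration only. */

    /* serialize the AM's local statistics into the shared area */
    typedef void (*amserializestats_function) (IndexBulkDeleteResult *local_stats,
                                               void *shared_area);

    /* rebuild the local statistics from the shared copy */
    typedef IndexBulkDeleteResult *(*amdeserializestats_function) (void *shared_area);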

+ * Since all vacuum workers write the bulk-deletion result at
+ * different slots we can write them without locking.
+ */
+ if (!shared_indstats->updated && stats[idx] != NULL)
+ {
+ memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+ shared_indstats->updated = true;
+
+ /*
+ * no longer need the locally allocated result and now
+ * stats[idx] points to the DSM segment.
+ */
+ pfree(stats[idx]);
+ stats[idx] = bulkdelete_res;
+ }

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#143Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#134)
2 attachment(s)

On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking that if we write patches for both approaches (a. compute
shared costs and delay based on that; b. divide the I/O cost among
workers as described in the email above[1]) and run some tests to see
the throttling behavior, that might help us decide which strategy best
solves this problem, if any. What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on approach
(a), then we can do some testing and compare. Sawada-san, by any chance
would you be interested in writing a POC for approach (a)? Otherwise, I
will try to write it after finishing the first one (approach b).

I have come up with the POC for approach (a).

Can we compute the overall throttling (sleep time) in the operation
separately for the heap and the indexes, then divide the index's
sleep_time by the number of workers and add it to the heap's sleep
time? Then it will be a bit easier to compare the data between the
parallel and non-parallel cases.

I have come up with a patch to compute the total delay during the
vacuum. The idea for computing the total cost delay is:

Total cost delay = total delay of heap scan
                   + (total delay of index vacuuming / number of workers)

A patch for the same is attached.
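For example (numbers invented for illustration), if the heap scan slept
for 800 ms in total and three workers each slept 300 ms during index
vacuuming with the leader not participating, the reported total would be
800 + (900 / 3) = 1100 ms.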

I have prepared this patch on top of the latest parallel vacuum
patch[1]. I have also rebased the patch for approach (b), which divides
the vacuum cost limit, and done some testing of the I/O throttling. The
attached patches 0001-POC-compute-total-cost-delay and
0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't
rebased on top of v31-0006, because v31-0006 implements the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit does
the same with the other approach. However,
0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
well (just a 1-2 line conflict).

Testing: I have performed two tests, one with same-size indexes and a
second with different-size indexes, and measured the total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000
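
(For reference, with these settings the divide approach of the attached
0002 patch works out as follows: with, say, 3 launched workers plus a
participating leader, each backend runs with a cost limit of
2000 / 4 = 500 during index vacuuming, so the aggregate I/O budget
across all participants stays at the configured 2000. The worker count
here is only an example.)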

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay   1784 (ms)        1398 (ms)          1938 (ms)

Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 100000;
create index idx3 on test(c) where a > 150000;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay   1438 (ms)        1029 (ms)          1529 (ms)

Conclusion:
1. The tests show that the total I/O delay is significantly less with
the parallel vacuum.
2. With the vacuum cost divide patch the problem is solved, but the
delay is a bit higher compared to the non-parallel version. The reason
could be the problem discussed at [2], but it needs further
investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I
will also try to test different types of indexes.

[1]: /messages/by-id/CAD21AoBMo9dr_QmhT=dKh7fmiq7tpx+yLHR8nw9i5NZ-SgtaVg@mail.gmail.com
[2]: /messages/by-id/CAA4eK1+PeiFLdTuwrE6CvbNdx80E-O=ZxCuWB2maREKFD-RaCA@mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

0001-POC-compute-total-cost-delay.patch (application/octet-stream)
From 560df7cb72a550d813ca1152f2732bde20df6fa2 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Sun, 27 Oct 2019 16:19:51 +0530
Subject: [PATCH 1/2] compute total cost delay

---
 src/backend/access/heap/vacuumlazy.c | 59 ++++++++++++++++++++++++++++++++++++
 src/backend/commands/vacuum.c        |  1 +
 src/backend/utils/init/globals.c     |  2 +-
 src/include/miscadmin.h              |  1 +
 4 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 02040c8..d7e99d7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -137,6 +137,7 @@
 #define PARALLEL_VACUUM_KEY_SHARED			1
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+#define PARALLEL_VACUUM_KEY_COST_DELAY		4
 
 /*
  * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
@@ -247,6 +248,13 @@ typedef struct LVSharedIndStats
 #define GetIndexBulkDeleteResult(s) \
 	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
 
+typedef struct LVCostDelay
+{
+	pg_atomic_uint32	nslot;
+	double	vaccostdelay[FLEXIBLE_ARRAY_MEMBER];
+} LVCostDelay;
+#define SizeOfLVCostDelay (offsetof(LVCostDelay, vaccostdelay) + sizeof(double))
+
 /* Struct for parallel lazy vacuum */
 typedef struct LVParallelState
 {
@@ -255,6 +263,8 @@ typedef struct LVParallelState
 	/* Shared information among parallel vacuum workers */
 	LVShared		*lvshared;
 
+	/* Shared cost delay. */
+	LVCostDelay		*lvcostdelay;
 	/*
 	 * Always true except for a debugging case where
 	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
@@ -746,6 +756,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		parallel_workers = compute_parallel_workers(Irel, nindexes,
 													params->nworkers);
 
+	VacuumCostTotalDelay = 0;
 	if (parallel_workers > 0)
 	{
 		/*
@@ -1722,6 +1733,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					vacrelstats->scanned_pages, nblocks),
 			 errdetail_internal("%s", buf.data)));
 	pfree(buf.data);
+
+	elog(LOG, "Total cost delay = %lf", VacuumCostTotalDelay);
 }
 
 
@@ -1956,6 +1969,9 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 										int nindexes, IndexBulkDeleteResult **stats,
 										LVParallelState *lps)
 {
+	int		i;
+	double	costdelay;
+
 	Assert(!IsParallelWorker());
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
@@ -1976,6 +1992,14 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
 
 	/*
+	 * Remember the total delay so far and reset VacuumCostTotalDelay so
+	 * that we can measure the leader's contribution to the total delay
+	 * during the index vacuuming phase.
+	 */
+	costdelay = VacuumCostTotalDelay;
+	VacuumCostTotalDelay = 0;
+
+	/*
 	 * Join as parallel workers. The leader process alone does that in case where
 	 * no workers launched.
 	 */
@@ -1986,6 +2010,22 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
+	/* Collect the delay from all workers and add it to the total delay. */
+	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+	{
+		VacuumCostTotalDelay += lps->lvcostdelay->vaccostdelay[i];
+	}
+
+	/*
+	 * Compute the average cost delay.
+	 */
+	if (lps->leaderparticipates)
+		VacuumCostTotalDelay /= (lps->pcxt->nworkers_launched + 1);
+	else
+		VacuumCostTotalDelay /= lps->pcxt->nworkers_launched;
+
+	VacuumCostTotalDelay += costdelay;
+
 	/*
 	 * We need to reinitialize the parallel context as no more index vacuuming and
 	 * index cleanup will be performed after that.
@@ -2943,10 +2983,12 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	ParallelContext *pcxt;
 	LVShared		*shared;
 	LVDeadTuples	*dead_tuples;
+	LVCostDelay		*costdelay;
 	long	maxtuples;
 	char	*sharedquery;
 	Size	est_shared;
 	Size	est_deadtuples;
+	Size	est_costdelay;
 	int		querylen;
 	int		i;
 
@@ -3016,6 +3058,14 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	memcpy(sharedquery, debug_query_string, querylen + 1);
 	sharedquery[querylen] = '\0';
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	/* Vacuum cost delay. */
+	est_costdelay = MAXALIGN(add_size(SizeOfLVCostDelay,
+									  mul_size(sizeof(double), nrequested)));
+	costdelay = (LVCostDelay *) shm_toc_allocate(pcxt->toc, est_costdelay);
+	pg_atomic_init_u32(&(costdelay->nslot), 0);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_COST_DELAY, costdelay);
+	lps->lvcostdelay = costdelay;
 
 	return lps;
 }
@@ -3145,8 +3195,10 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	Relation	*indrels;
 	LVShared	*lvshared;
 	LVDeadTuples	*dead_tuples;
+	LVCostDelay		*costdelay;	
 	int			nindexes;
 	char		*sharedquery;
+	int			slot;
 	IndexBulkDeleteResult **stats;
 
 	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
@@ -3180,6 +3232,11 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
 												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
 												  false);
+	
+	costdelay = (LVCostDelay *) shm_toc_lookup(toc,
+												   PARALLEL_VACUUM_KEY_COST_DELAY,
+												   false);
+	slot = pg_atomic_fetch_add_u32(&(costdelay->nslot), 1);
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
@@ -3198,6 +3255,8 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
 									 dead_tuples);
 
+	/* update the total delay in the shared location. */
+	costdelay->vaccostdelay[slot] = VacuumCostTotalDelay;
 	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
 	table_close(onerel, ShareUpdateExclusiveLock);
 }
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9ada501..56fdefd 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2010,6 +2010,7 @@ vacuum_delay_point(void)
 		/* update balance values for workers */
 		AutoVacuumUpdateDelay();
 
+		VacuumCostTotalDelay += msec;
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 3bf96de..a5a1129 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -139,7 +139,7 @@ int			VacuumCostPageMiss = 10;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 double		VacuumCostDelay = 0;
-
+double		VacuumCostTotalDelay = 0;
 int			VacuumPageHit = 0;
 int			VacuumPageMiss = 0;
 int			VacuumPageDirty = 0;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index bc6e03f..ab1c0ce 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -251,6 +251,7 @@ extern int	VacuumCostPageMiss;
 extern int	VacuumCostPageDirty;
 extern int	VacuumCostLimit;
 extern double VacuumCostDelay;
+extern double VacuumCostTotalDelay;
 
 extern int	VacuumPageHit;
 extern int	VacuumPageMiss;
-- 
1.8.3.1

0002-POC-divide-vacuum-cost-limit.patch (application/octet-stream)
From df0f24e519a1a8fc94dfc21a321fda7677007982 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Mon, 28 Oct 2019 09:58:18 +0530
Subject: [PATCH 2/2] POC-divide-vacuum-cost-limit

---
 src/backend/access/heap/vacuumlazy.c | 94 +++++++++++++++++++++++++++++++++++-
 1 file changed, 92 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d7e99d7..18ae8bb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -138,6 +138,7 @@
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
 #define PARALLEL_VACUUM_KEY_COST_DELAY		4
+#define PARALLEL_VACUUM_KEY_COST_BALANCE	5
 
 /*
  * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
@@ -255,6 +256,14 @@ typedef struct LVCostDelay
 } LVCostDelay;
 #define SizeOfLVCostDelay offsetof(LVCostDelay, vaccostdelay) + sizeof(double)
 
+typedef struct LVCostBalance
+{
+	pg_atomic_uint32	nslot;
+	int		nworkers;
+	int		vaccostbalance[FLEXIBLE_ARRAY_MEMBER];
+} LVCostBalance;
+#define SizeOfLVCostBalance (offsetof(LVCostBalance, vaccostbalance) + sizeof(int))
+
 /* Struct for parallel lazy vacuum */
 typedef struct LVParallelState
 {
@@ -265,6 +274,9 @@ typedef struct LVParallelState
 
 	/* Shared cost delay. */
 	LVCostDelay		*lvcostdelay;
+
+	/* Shared cost balance. */
+	LVCostBalance   *lvcostbalance;
 	/*
 	 * Always true except for a debugging case where
 	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
@@ -1959,6 +1971,31 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+static void
+compute_cost_balance(LVParallelState *lps)
+{
+	int i;
+
+	/*
+	 * Share the estimated worker count so that each worker can compute its
+	 * cost limit.  Include the leader if it is participating in the index
+	 * vacuum phase.
+	 * XXX: The number of workers actually launched might be less than the
+	 * estimate, in which case each worker operates with a lower cost limit.
+	 */
+	lps->lvcostbalance->nworkers = lps->pcxt->nworkers;
+	if (lps->leaderparticipates)
+		lps->lvcostbalance->nworkers += 1;
+
+	/*
+	 * Divide the current cost balance among the workers so that we don't
+	 * lose the accounting of the I/O balance accumulated so far.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers; i++)
+		lps->lvcostbalance->vaccostbalance[i] =
+				VacuumCostBalance / lps->lvcostbalance->nworkers;
+}
+
 /*
  * Perform index vacuuming or index cleanup with parallel workers. This function
  * must be used by the parallel vacuum leader process. The caller must set
@@ -1976,6 +2013,9 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
 
+	/* Compute cost balance for the workers. */
+	compute_cost_balance(lps);
+
 	LaunchParallelWorkers(lps->pcxt);
 
 	if (lps->lvshared->for_cleanup)
@@ -2004,12 +2044,36 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	 * no workers launched.
 	 */
 	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+	{
+		int base_cost_limit = VacuumCostLimit;
+
+		/*
+		 * If the leader is participating and we have launched parallel workers,
+		 * then compute the leader's share of the cost limit and cost balance.
+		 */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+				VacuumCostLimit /= lps->lvcostbalance->nworkers;
+				VacuumCostBalance /= lps->lvcostbalance->nworkers;
+		}
 		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
 										 vacrelstats->dead_tuples);
+		VacuumCostLimit = base_cost_limit;
+	}
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
+	pg_atomic_write_u32(&(lps->lvcostbalance->nslot), 0);
+
+	/*
+	 * The index vacuuming phase is complete, so collect the remaining
+	 * balance from all the workers and add it to the leader's current
+	 * balance, so that we don't lose the workers' leftover I/O accounting.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+		VacuumCostBalance += lps->lvcostbalance->vaccostbalance[i];
+
 	/* Collect the delay from all workers and add it to the total delay. */
 	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
 	{
@@ -2984,11 +3048,13 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	LVShared		*shared;
 	LVDeadTuples	*dead_tuples;
 	LVCostDelay		*costdelay;
+	LVCostBalance   *costbalance;
 	long	maxtuples;
 	char	*sharedquery;
 	Size	est_shared;
 	Size	est_deadtuples;
 	Size	est_costdelay;
+	Size    est_costbalance;
 	int		querylen;
 	int		i;
 
@@ -3067,6 +3133,15 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_COST_DELAY, costdelay);
 	lps->lvcostdelay = costdelay;
 
+
+	/* Vacuum cost balance. */
+	est_costbalance = MAXALIGN(add_size(SizeOfLVCostBalance,
+										mul_size(sizeof(int), nrequested))); 
+	costbalance = (LVCostBalance *) shm_toc_allocate(pcxt->toc, est_costbalance);
+	pg_atomic_init_u32(&(costbalance->nslot), 0);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_COST_BALANCE, costbalance);
+	lps->lvcostbalance = costbalance;
+
 	return lps;
 }
 
@@ -3195,7 +3270,8 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	Relation	*indrels;
 	LVShared	*lvshared;
 	LVDeadTuples	*dead_tuples;
-	LVCostDelay		*costdelay;	
+	LVCostDelay		*costdelay;
+	LVCostBalance   *costbalance;
 	int			nindexes;
 	char		*sharedquery;
 	int			slot;
@@ -3236,7 +3312,10 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	costdelay = (LVCostDelay *) shm_toc_lookup(toc,
 												   PARALLEL_VACUUM_KEY_COST_DELAY,
 												   false);
-	slot = pg_atomic_fetch_add_u32(&(costdelay->nslot), 1);
+	costbalance = (LVCostBalance *) shm_toc_lookup(toc,
+												   PARALLEL_VACUUM_KEY_COST_BALANCE,
+												   false);
+	slot = pg_atomic_fetch_add_u32(&(costbalance->nslot), 1);
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
@@ -3245,6 +3324,10 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
 
+	/* Compute the vacuum cost limit for the worker. */
+	VacuumCostLimit = VacuumCostLimit / costbalance->nworkers;
+	VacuumCostBalance = costbalance->vaccostbalance[slot];
+
 	stats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
 
@@ -3257,6 +3340,13 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* update the total delay in the shared location. */
 	costdelay->vaccostdelay[slot] = VacuumCostTotalDelay;
+
+	/*
+	 * Share the remaining balance with the leader so that we don't lose
+	 * the accounting for it.
+	 */
+	costbalance->vaccostbalance[slot] = VacuumCostBalance;
+
 	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
 	table_close(onerel, ShareUpdateExclusiveLock);
 }
-- 
1.8.3.1

#144Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#142)

On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I haven't yet read the new patch set, but I have noticed one thing: we
are getting the size of the statistics using the AM routine, yet we are
copying those statistics from local memory to shared memory directly
using memcpy. Wouldn't it be a good idea to have an AM-specific routine
that copies them from local memory to shared memory? I am not sure
whether it is worth it, but my thought is that it would allow an AM to
keep its local stats in any form (for example, storing a pointer in
them) and serialize them while copying to the shared stats. Later, when
the shared stats are passed back to the AM, it can deserialize them into
its local form and use them.

You have a point, but after the gist index change we don't have any
current use case for indexes that need something like that. So on one
side there is some value in having an API to copy the stats, but on the
other side, without a clear use case it might not be good to expose a
new API. I think we can expose such an API in the future if a need
arises. Do you or anyone else know of any external index AM that has
such a need?

A few minor comments while glancing through the latest patchset.

1. I think you can merge the 0001*, 0002*, and 0003* patches into one
patch, as all three expose new variables/functions from IndexAmRoutine.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from an autovacuum worker
process, at least not as of now, so isn't it better to have an Assert
for that?
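
For example, something like this at the top of the function (a sketch
using the existing IsAutoVacuumWorkerProcess() check):

    Assert(!IsAutoVacuumWorkerProcess());
    vac_work_mem = maintenance_work_mem;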

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function is for performing a parallel operation on the indexes, so
why start the name with heap? It would be better to name it
index_parallel_vacuum_main or simply parallel_vacuum_main.

4.
/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
  BlockNumber pages_removed;
  double tuples_deleted;
  BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
- /* List of TIDs of tuples we intend to delete */
- /* NB: this list is ordered by TID address */
- int num_dead_tuples; /* current # of entries */
- int max_dead_tuples; /* # slots allocated in array */
- ItemPointer dead_tuples; /* array of ItemPointerData */
+ LVDeadTuples *dead_tuples;
  int num_index_scans;
  TransactionId latestRemovedXid;
  bool lock_waiter_detected;
 } LVRelStats;

-
/* A few variables that don't seem worth passing around as parameters */
static int elevel = -1;

It seems like a spurious line removal.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#145Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#144)

On Mon, Oct 28, 2019 at 12:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I haven't yet read the new patch set, but I have noticed one thing: we
are getting the size of the statistics using the AM routine, yet we are
copying those statistics from local memory to shared memory directly
using memcpy. Wouldn't it be a good idea to have an AM-specific routine
that copies them from local memory to shared memory? I am not sure
whether it is worth it, but my thought is that it would allow an AM to
keep its local stats in any form (for example, storing a pointer in
them) and serialize them while copying to the shared stats. Later, when
the shared stats are passed back to the AM, it can deserialize them into
its local form and use them.

You have a point, but after the gist index change we don't have any
current use case for indexes that need something like that. So on one
side there is some value in having an API to copy the stats, but on the
other side, without a clear use case it might not be good to expose a
new API. I think we can expose such an API in the future if a need
arises.

I agree with the point. But the current patch already exposes an API
for estimating the size of the statistics. So IMHO we should either
expose both APIs (for estimating the size of the stats and for copying
the stats) or neither. Am I missing something here?

Do you or anyone else know of any external index AM that has such a
need?

A few minor comments while glancing through the latest patchset.

1. I think you can merge the 0001*, 0002*, and 0003* patches into one
patch, as all three expose new variables/functions from IndexAmRoutine.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from an autovacuum worker
process, at least not as of now, so isn't it better to have an Assert
for that?

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function is for performing a parallel operation on the indexes, so
why start the name with heap? It would be better to name it
index_parallel_vacuum_main or simply parallel_vacuum_main.

4.
/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
BlockNumber pages_removed;
double tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
- /* List of TIDs of tuples we intend to delete */
- /* NB: this list is ordered by TID address */
- int num_dead_tuples; /* current # of entries */
- int max_dead_tuples; /* # slots allocated in array */
- ItemPointer dead_tuples; /* array of ItemPointerData */
+ LVDeadTuples *dead_tuples;
int num_index_scans;
TransactionId latestRemovedXid;
bool lock_waiter_detected;
} LVRelStats;

-
/* A few variables that don't seem worth passing around as parameters */
static int elevel = -1;

It seems like a spurious line removal.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#146Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#141)

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

To explain my idea in more detail: the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value exceeds
VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit from
the shared value. Since vacuum_delay_point is typically called once per
page processed, I expect there will be no such problem. Thoughts?

Oh right, I had assumed that you add the local balance to the shared
value only when the local balance exceeds VacuumCostLimit, but you are
adding it to the shared value every time in vacuum_delay_point. So I
think your idea is correct.

I've attached the updated patch set.

First three patches add new variables and a callback to index AM.

The next two patches are the main part of parallel vacuum support. I've
incorporated all review comments I got so far. The memory layout of the
variable-length index statistics might be a bit complex. It's similar to
the heap tuple header format in that it has a null bitmap, and both the
size of the index statistics and the actual data for each index follow.

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion. So canparallelvacuum of gist index is true.

+ /* Get the space for IndexBulkDeleteResult */
+ bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (shared_indstats->updated && stats[idx] == NULL)
+ stats[idx] = bulkdelete_res;
+

I have a doubt about this hunk: I do not understand when this condition
will be hit. Whenever we set shared_indstats->updated to true, we also
set stats[idx] to the shared stats. So I am not sure in what case
shared_indstats->updated will be true while stats[idx] still points to
NULL.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#147Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#146)

On Mon, Oct 28, 2019 at 6:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

To explain my idea in more detail: the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value exceeds
VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit from
the shared value. Since vacuum_delay_point is typically called once per
page processed, I expect there will be no such problem. Thoughts?

Oh right, I had assumed that you add the local balance to the shared
value only when the local balance exceeds VacuumCostLimit, but you are
adding it to the shared value every time in vacuum_delay_point. So I
think your idea is correct.

I've attached the updated patch set.

First three patches add new variables and a callback to index AM.

The next two patches are the main part of parallel vacuum support. I've
incorporated all review comments I got so far. The memory layout of the
variable-length index statistics might be a bit complex. It's similar to
the heap tuple header format in that it has a null bitmap, and both the
size of the index statistics and the actual data for each index follow.

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion. So canparallelvacuum of gist index is true.

+ /* Get the space for IndexBulkDeleteResult */
+ bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (shared_indstats->updated && stats[idx] == NULL)
+ stats[idx] = bulkdelete_res;
+

I have a doubt about this hunk: I do not understand when this condition
will be hit. Whenever we set shared_indstats->updated to true, we also
set stats[idx] to the shared stats. So I am not sure in what case
shared_indstats->updated will be true while stats[idx] still points to
NULL.

I think it can be true in the case where one parallel vacuum worker
vacuums an index that was vacuumed by another worker in a previous index
vacuum cycle. Suppose that worker-A and worker-B vacuumed index-A and
index-B respectively, and that worker-A then vacuums index-B in the next
index vacuum cycle. In this case, shared_indstats->updated is true
because worker-B already vacuumed it in the previous cycle. On the other
hand, stats[idx] on worker-A is NULL because it's the first time
worker-A has vacuumed index-B. Therefore worker-A updates its stats[idx]
to point to the bulk-deletion result in DSM in order to pass it to the
index AM.
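
Spelled out as a timeline (the cycle numbering is only for
illustration):

    cycle 1:  worker-A vacuums index-A; worker-B vacuums index-B
              -> worker-B sets index-B's shared_indstats->updated = true
    cycle 2:  worker-A picks up index-B
              -> worker-A's stats[idx] is still NULL, but updated is true,
                 so worker-A points stats[idx] at the DSM copy before
                 calling the index AM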

Regards,

--
Masahiko Sawada

#148Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#147)

On Tue, Oct 29, 2019 at 10:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 28, 2019 at 6:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

To explain my idea in more detail: the first worker that enters
vacuum_delay_point adds its local value to the shared value and resets
the local value to 0. The worker then sleeps if the shared value exceeds
VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit from
the shared value. Since vacuum_delay_point is typically called once per
page processed, I expect there will be no such problem. Thoughts?

Oh right, I had assumed that you add the local balance to the shared
value only when the local balance exceeds VacuumCostLimit, but you are
adding it to the shared value every time in vacuum_delay_point. So I
think your idea is correct.

I've attached the updated patch set.

First three patches add new variables and a callback to index AM.

The next two patches are the main part of parallel vacuum support. I've
incorporated all review comments I got so far. The memory layout of the
variable-length index statistics might be a bit complex. It's similar to
the heap tuple header format in that it has a null bitmap, and both the
size of the index statistics and the actual data for each index follow.

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion. So canparallelvacuum of gist index is true.

+ /* Get the space for IndexBulkDeleteResult */
+ bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (shared_indstats->updated && stats[idx] == NULL)
+ stats[idx] = bulkdelete_res;
+

I have a doubt about this hunk: I do not understand when this condition
will be hit. Whenever we set shared_indstats->updated to true, we also
set stats[idx] to the shared stats. So I am not sure in what case
shared_indstats->updated will be true while stats[idx] still points to
NULL.

I think it can be true in the case where one parallel vacuum worker
vacuums an index that was vacuumed by another worker in a previous index
vacuum cycle. Suppose that worker-A and worker-B vacuumed index-A and
index-B respectively, and that worker-A then vacuums index-B in the next
index vacuum cycle. In this case, shared_indstats->updated is true
because worker-B already vacuumed it in the previous cycle. On the other
hand, stats[idx] on worker-A is NULL because it's the first time
worker-A has vacuumed index-B. Therefore worker-A updates its stats[idx]
to point to the bulk-deletion result in DSM in order to pass it to the
index AM.

Okay, that makes sense.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#149Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#143)
4 attachment(s)

On Mon, Oct 28, 2019 at 2:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking that if we write patches for both approaches (a. compute
shared costs and delay based on that; b. divide the I/O cost among
workers as described in the email above[1]) and run some tests to see
the throttling behavior, that might help us decide which strategy best
solves this problem, if any. What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on approach
(a), then we can do some testing and compare. Sawada-san, by any chance
would you be interested in writing a POC for approach (a)? Otherwise, I
will try to write it after finishing the first one (approach b).

I have come up with the POC for approach (a).

Can we compute the overall throttling (sleep time) in the operation
separately for the heap and the indexes, then divide the index's
sleep_time by the number of workers and add it to the heap's sleep
time? Then it will be a bit easier to compare the data between the
parallel and non-parallel cases.

I have come up with a patch to compute the total delay during the
vacuum. The idea for computing the total cost delay is:

Total cost delay = total delay of heap scan
                   + (total delay of index vacuuming / number of workers)

A patch for the same is attached.

I have prepared this patch on top of the latest parallel vacuum
patch[1]. I have also rebased the patch for approach (b), which divides
the vacuum cost limit, and done some testing of the I/O throttling. The
attached patches 0001-POC-compute-total-cost-delay and
0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't
rebased on top of v31-0006, because v31-0006 implements the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit does
the same with the other approach. However,
0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
well (just a 1-2 line conflict).

Testing: I have performed two tests, one with same-size indexes and a
second with different-size indexes, and measured the total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay   1784 (ms)        1398 (ms)          1938 (ms)

Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 100000;
create index idx3 on test(c) where a > 150000;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay   1438 (ms)        1029 (ms)          1529 (ms)

Conclusion:
1. The tests show that the total I/O delay is significantly less with
the parallel vacuum.
2. With the vacuum cost divide patch the problem is solved, but the
delay is a bit higher compared to the non-parallel version. The reason
could be the problem discussed at [2], but it needs further
investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I
will also try to test different types of indexes.

Thank you for testing!

I realized that the v31-0006 patch doesn't work correctly, so I've
attached an updated version that also incorporates some of the comments
I've received so far. Sorry for the inconvenience. I'll apply your 0001
patch and also test the total delay time.

Regards,

--
Masahiko Sawada

Attachments:

v32-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch (text/x-patch)
From 9da5930da73a1ea10ad8e782171fb8e10a333b0a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v32 1/4] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  3 ++
 doc/src/sgml/indexam.sgml                     | 21 ++++++++++++++
 src/backend/access/brin/brin.c                |  3 ++
 src/backend/access/gin/ginutil.c              |  3 ++
 src/backend/access/gist/gist.c                |  3 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/index/indexam.c            | 29 +++++++++++++++++++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  3 ++
 src/include/access/amapi.h                    | 13 +++++++++
 src/include/access/genam.h                    |  1 +
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 12 files changed, 88 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 3d44616adc..d50122d9e2 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -120,7 +120,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
@@ -144,6 +146,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..df4cad11b3 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -120,8 +120,12 @@ typedef struct IndexAmRoutine
     bool        ampredlocks;
     /* does AM support parallel scan? */
     bool        amcanparallel;
+    /* does AM support parallel vacuum? */
+    bool        amcanparallelvacuum;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -149,6 +153,9 @@ typedef struct IndexAmRoutine
     amestimateparallelscan_function amestimateparallelscan;    /* can be NULL */
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
+
+    /* interface functions to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
   </para>
@@ -731,6 +738,20 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory that the
+   access method needs to store its bulk-deletion statistics in.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than size of <structname>IndexBulkDeleteResult</structname>.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729edd..4a3286bfde 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -100,7 +100,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
@@ -124,6 +126,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index cf9699ad18..a28a71999d 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -52,7 +52,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
@@ -76,6 +78,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc87911d6..752b5bc88c 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -74,7 +74,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -98,6 +100,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30dac42..dc0dc312ef 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -73,7 +73,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
@@ -97,6 +99,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..5238b9d38f 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -711,6 +711,35 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 	return indexRelation->rd_indam->amvacuumcleanup(info, stats);
 }
 
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+	Size		nbytes;
+
+	RELATION_CHECKS;
+
+	/*
+	 * If amestimateparallelvacuum is not provided, assume only
+	 * IndexBulkDeleteResult is needed.
+	 */
+	if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+	{
+		nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+		Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+	}
+	else
+		nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+	return nbytes;
+}
+
 /* ----------------
  *		index_can_return
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad..1ea2ba3fe0 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -122,7 +122,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
@@ -146,6 +148,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = btestimateparallelscan;
 	amroutine->aminitparallelscan = btinitparallelscan;
 	amroutine->amparallelrescan = btparallelrescan;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db147..4a1689859a 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -55,7 +55,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = true;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
@@ -79,6 +81,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..d166350bbe 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -156,6 +156,12 @@ typedef void (*aminitparallelscan_function) (void *target);
 /* (re)start parallel index scan */
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
+/*
+ * Callback function signatures - for parallel index vacuuming.
+ */
+/* estimate size of parallel index vacuuming memory */
+typedef Size (*amestimateparallelvacuum_function) (void);
+
 /*
  * API struct for an index AM.  Note this must be stored in a single palloc'd
  * chunk of memory.
@@ -195,8 +201,12 @@ typedef struct IndexAmRoutine
 	bool		ampredlocks;
 	/* does AM support parallel scan? */
 	bool		amcanparallel;
+	/* does AM support parallel vacuum? */
+	bool		amcanparallelvacuum;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
@@ -230,6 +240,9 @@ typedef struct IndexAmRoutine
 	amestimateparallelscan_function amestimateparallelscan; /* can be NULL */
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
+
+	/* interface functions to support parallel vacuum */
+	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
 
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a813b004be..48ed5bbac7 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -179,6 +179,7 @@ extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
 												void *callback_state);
 extern IndexBulkDeleteResult *index_vacuum_cleanup(IndexVacuumInfo *info,
 												   IndexBulkDeleteResult *stats);
+extern Size index_parallelvacuum_estimate(Relation indexRelation);
 extern bool index_can_return(Relation indexRelation, int attno);
 extern RegProcedure index_getprocid(Relation irel, AttrNumber attnum,
 									uint16 procnum);
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index bc68767f3a..374d545f0d 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -300,7 +300,9 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = false;
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
+	amroutine->amcanparallelvacuum = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
@@ -324,6 +326,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
-- 
2.22.0

v32-0004-PoC-shared-vacuum-cost-balance.patch (text/x-patch)
From 279ebab5dc0ad2ce0569bb84a80995257a746b80 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 21:56:24 +0900
Subject: [PATCH v32 4/4] PoC: shared vacuum cost balance

---
 src/backend/access/heap/vacuumlazy.c | 26 +++++++++++
 src/backend/commands/vacuum.c        | 67 ++++++++++++++++++++++------
 src/include/access/heapam.h          |  1 +
 3 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0cfa13b81b..a9d9f31887 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -211,6 +211,13 @@ typedef struct LVShared
 	/* The number of indexes that do NOT support parallel index vacuuming */
 	int		nindexes_nonparallel;
 
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
 	/*
 	 * Variables to control parallel index vacuuming.  Index statistics
 	 * returned from ambulkdelete and amvacuumcleanup is nullable variable
@@ -230,6 +237,9 @@ typedef struct LVShared
 #define IndStatsIsNull(s, i) \
 	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
 
+/* Global variable for shared cost-based vacuum delay */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+
 /*
  * Struct for an index bulk-deletion statistic used for parallel lazy
  * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
@@ -1961,6 +1971,14 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
 
+	/*
+	 * Move the current balance to the shared value and enable shared cost
+	 * balance.
+	 */
+	pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
+	VacuumCostBalance = 0;
+	VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+
 	LaunchParallelWorkers(lps->pcxt);
 
 	if (lps->lvshared->for_cleanup)
@@ -1987,6 +2005,12 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+
+	/* Carry the remaining shared balance over to the local balance */
+	VacuumCostBalance = pg_atomic_read_u32(&(lps->lvshared->cost_balance));
+
 	/*
 	 * We need to reinitialize the parallel context as no more index vacuuming and
 	 * index cleanup will be performed after that.
@@ -3000,6 +3024,7 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
 	prepare_index_statistics(shared, Irel, nindexes);
 	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
 
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
 	lps->lvshared = shared;
@@ -3188,6 +3213,7 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
 
 	stats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9ada501709..1b9ea9b672 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -412,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1990,28 +1991,68 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (VacuumCostActive && !InterruptPending)
 	{
-		double		msec;
+		/*
+		 * If the vacuum cost balance is shared among parallel workers we
+		 * decide whether to sleep based on that.
+		 */
+		if (VacuumSharedCostBalance != NULL)
+		{
+			while (true)
+			{
+				uint32 shared_balance;
+				uint32 new_balance;
 
-		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
-		if (msec > VacuumCostDelay * 4)
-			msec = VacuumCostDelay * 4;
+				msec = 0;
 
-		pg_usleep((long) (msec * 1000));
+				/* compute new balance by adding the local value */
+				shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+				new_balance = shared_balance + VacuumCostBalance;
 
-		VacuumCostBalance = 0;
+				if (new_balance >= VacuumCostLimit)
+				{
+					/* compute sleep time based on the shared cost balance */
+					msec = VacuumCostDelay * new_balance / VacuumCostLimit;
+					new_balance %= VacuumCostLimit;
+				}
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+				if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+												   &shared_balance,
+												   new_balance))
+					break;
+			}
 
-		/* Might have gotten an interrupt while sleeping */
-		CHECK_FOR_INTERRUPTS();
+			/*
+			 * Reset the local balance as we accumulated it into the shared
+			 * value.
+			 */
+			VacuumCostBalance = 0;
+		}
+		else if (VacuumCostBalance >= VacuumCostLimit)
+			msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+		/* Nap if appropriate */
+		if (msec > 0)
+		{
+			if (msec > VacuumCostDelay * 4)
+				msec = VacuumCostDelay * 4;
+
+			pg_usleep((long) (msec * 1000));
+
+			VacuumCostBalance = 0;
+
+			/* update balance values for workers */
+			AutoVacuumUpdateDelay();
+
+			/* Might have gotten an interrupt while sleeping */
+			CHECK_FOR_INTERRUPTS();
+		}
 	}
 }
 
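To make the retry loop above easier to follow, here is a minimal
standalone sketch (not part of the patch) of the accumulate-then-sleep
pattern that the new vacuum_delay_point() implements.  C11 atomics stand
in for pg_atomic_u32, and all names and constants are illustrative:

    /* cost_balance_sketch.c -- illustrative only, not PostgreSQL code */
    #include <stdatomic.h>
    #include <stdio.h>

    static _Atomic unsigned shared_balance;  /* stands in for LVShared.cost_balance */
    static const unsigned cost_limit = 200;  /* stands in for VacuumCostLimit */
    static const double cost_delay = 20.0;   /* stands in for VacuumCostDelay (msec) */

    /* Fold one worker's local balance into the shared balance; return sleep time. */
    static double
    add_balance_and_decide(unsigned local_balance)
    {
        unsigned old, new;
        double   msec;

        do
        {
            msec = 0;
            old = atomic_load(&shared_balance);
            new = old + local_balance;
            if (new >= cost_limit)
            {
                /* this caller pays the sleep; keep only the remainder */
                msec = cost_delay * new / cost_limit;
                new %= cost_limit;
            }
            /* retry if another worker updated the balance concurrently */
        } while (!atomic_compare_exchange_weak(&shared_balance, &old, new));

        /* the patch additionally caps the sleep at 4 * VacuumCostDelay */
        return msec;
    }

    int
    main(void)
    {
        printf("sleep %.1f msec\n", add_balance_and_decide(250));
        return 0;
    }

The modulo keeps any excess beyond full multiples of the limit in the
shared counter, so it is carried over rather than discarded.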
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 12065cc038..ac883f67d1 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -192,6 +192,7 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
-- 
2.22.0

Attachment: v32-0003-Add-paralell-P-option-to-vacuumdb-command.patch (text/x-patch, US-ASCII)
From d17fc1ed9c2255800f993814bb6e1502045c39e4 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v32 3/4] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with
+        <replaceable class="parameter">workers</replaceable> background
+        workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure that your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        at least one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* a value of 0 means PARALLEL without an explicit degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.22.0

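A side note on the option parsing above: "P::" in the getopt string (and
optional_argument in the long-option table) makes the argument optional,
and, assuming GNU getopt semantics ("::" is a GNU extension), an optional
short-option argument is recognized only when it is attached to the
option.  That is why the tests exercise -P2 and a bare -P rather than
-P 2.  A minimal sketch:

    /* getopt_optional_sketch.c -- illustrative only */
    #include <getopt.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
        static struct option long_options[] = {
            {"parallel", optional_argument, NULL, 'P'},
            {NULL, 0, NULL, 0}
        };
        int c;

        /* "P::": -P takes an optional, attached argument (-P2, not -P 2) */
        while ((c = getopt_long(argc, argv, "P::", long_options, NULL)) != -1)
        {
            if (c == 'P')
            {
                if (optarg != NULL)     /* -P2 or --parallel=2 */
                    printf("VACUUM (PARALLEL %d)\n", atoi(optarg));
                else                    /* bare -P or --parallel */
                    printf("VACUUM (PARALLEL)\n");
            }
        }
        return 0;
    }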
Attachment: v32-0002-Add-parallel-option-to-VACUUM-command.patch (text/x-patch, US-ASCII)
From 12a7a965a8dcb38ce79ba88afd20ea55fdeb5683 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v32 2/4] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used only when the table has
at least two indexes, and a parallel degree larger than the number of
indexes on the table cannot be specified.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index sizes don't
affect it.
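
In other words, the degree is clamped roughly as in the sketch below,
which condenses compute_parallel_workers() from the patch (leader
participation, which subtracts one index from the worker count, is
omitted here):

    #include <stdio.h>

    #define Min(a, b) ((a) < (b) ? (a) : (b))

    /*
     * Simplified sketch: nrequested is the user-specified degree (0 means
     * "let VACUUM decide"), nindexes_parallel is the number of indexes
     * supporting parallel vacuum, and the result is capped by
     * max_parallel_maintenance_workers.
     */
    static int
    parallel_degree(int nrequested, int nindexes_parallel,
                    int max_parallel_maintenance_workers)
    {
        int nworkers;

        if (nindexes_parallel == 0 || max_parallel_maintenance_workers == 0)
            return 0;

        nworkers = (nrequested > 0) ?
            Min(nrequested, nindexes_parallel) : nindexes_parallel;

        return Min(nworkers, max_parallel_maintenance_workers);
    }

    int
    main(void)
    {
        /* e.g. 4 requested, 3 parallel-capable indexes, GUC set to 2 => 2 */
        printf("%d\n", parallel_degree(4, 3, 2));
        return 0;
    }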
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 ++
 src/backend/access/heap/vacuumlazy.c  | 1048 ++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |    4 +
 src/backend/commands/vacuum.c         |   45 ++
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/commands/vacuum.h         |    5 +
 src/test/regress/expected/vacuum.out  |   14 +
 src/test/regress/sql/vacuum.sql       |   10 +
 11 files changed, 1084 insertions(+), 108 deletions(-)

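Before diving into the diff, the trickiest piece of shared state is the
variable-length tail of LVShared: a NULL bitmap with one bit per index (a
1 bit means the index has a statistics slot, a 0 bit means it does not
support parallel vacuum), followed by the per-index statistics area that
the offset field points to.  The bit addressing used by the
IndStatsIsNull() macro works as in this small standalone sketch
(illustrative, not patch code):

    #include <stdio.h>
    #include <string.h>

    /* one bit per index: bit set => the index has a stats slot in the tail */
    #define BITMAPLEN(n)        (((n) + 7) / 8)
    #define BIT_IS_SET(bm, i)   ((bm)[(i) >> 3] & (1 << ((i) & 0x07)))

    int
    main(void)
    {
        unsigned char bitmap[BITMAPLEN(10)];
        int i;

        memset(bitmap, 0, sizeof(bitmap));

        /* mark indexes 0, 3 and 9 as supporting parallel vacuum */
        bitmap[0 >> 3] |= 1 << (0 & 0x07);
        bitmap[3 >> 3] |= 1 << (3 & 0x07);
        bitmap[9 >> 3] |= 1 << (9 & 0x07);

        for (i = 0; i < 10; i++)
            printf("index %d: %s\n", i,
                   BIT_IS_SET(bitmap, i) ? "has stats slot" : "NULL (skipped)");
        return 0;
    }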
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 886632ff43..335a0ec752 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2265,13 +2265,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..ae086b976b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of
+      <command>VACUUM</command> in parallel using
+      <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation that support parallel vacuum
+      operation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Please note
+      that the number of parallel workers specified in
+      <replaceable class="parameter">integer</replaceable> is not
+      guaranteed to be used during execution. It is possible for a vacuum
+      to run with fewer workers than specified, or even with no workers at
+      all. Only one worker can be used per index, so parallel workers are
+      launched only when there are at least <literal>2</literal> indexes in
+      the table. Workers are launched before the start of each phase and
+      exit at the end of the phase. These behaviors might change in a
+      future release. This option cannot be used with the
+      <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuuming purposes.
+     Even if it is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..0cfa13b81b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Each individual index is processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuuming or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  The leader process then re-initializes the parallel context while
+ * keeping the recorded dead tuples so that it can launch parallel workers
+ * again the next time.  Note that the parallel workers live only during index
+ * vacuuming or index cleanup, but the leader process neither exits from
+ * parallel mode nor destroys the parallel context in between.  Since no
+ * updates are allowed during parallel mode, we update the index statistics
+ * only after exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,139 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are in
+ * parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment when parallel lazy vacuum
+ * mode, otherwise allocated in a local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel vacuum workers, and therefore
+ * allocated in the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells the vacuum workers whether to do index vacuuming or index
+	 * cleanup.
+	 */
+	bool	for_cleanup;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old number of live tuples in the index vacuuming case, and to
+	 * the new number of live tuples in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In single-process lazy vacuum we could consume additional memory
+	 * during index vacuuming or cleanup, apart from the memory for heap
+	 * scanning, if an index consumes memory in ambulkdelete or
+	 * amvacuumcleanup.  In parallel index vacuuming, since each vacuum
+	 * worker consumes memory, we set a new maintenance_work_mem for each
+	 * worker so as not to use more memory than single-process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/* The number of indexes that do NOT support parallel index vacuuming */
+	int		nindexes_nonparallel;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The index statistics
+	 * returned from ambulkdelete and amvacuumcleanup are nullable and of
+	 * variable length.  'bitmap' is the NULL bitmap: a 0 bit indicates a
+	 * null, while a 1 bit indicates non-null.  The index statistics
+	 * follow at the end of the struct.
+	 */
+	pg_atomic_uint32	nprocessed;	/* counter for vacuuming and clean up */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
+ * follows at the end of the struct.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated */
+
+	/* Index bulk-deletion result data follows at end of struct */
+} LVSharedIndStats;
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * Always true except for the debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +280,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +303,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +316,36 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
 
 
 /*
@@ -488,6 +659,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before
+ *		starting the heap scan, so that we can record dead tuples in the DSM
+ *		segment. The parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup and exit once done with all indexes. At
+ *		the end of this function we exit from parallel mode. Index
+ *		bulk-deletion results are stored in the DSM segment, and we update
+ *		the index statistics all at once after exiting from parallel mode,
+ *		since no writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +679,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +703,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +739,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we are vacuuming indexes,
+	 * compute the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +951,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +980,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1000,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1196,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1235,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1381,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1451,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1480,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1595,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1629,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1645,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1671,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1749,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1758,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1806,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1817,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1947,290 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers. This function
+ * must be used by the parallel vacuum leader process. The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to do vacuuming or cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	LaunchParallelWorkers(lps->pcxt);
+
+	if (lps->lvshared->for_cleanup)
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+								 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+	else
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+								 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Join as a parallel worker.  The leader process does the work alone
+	 * in case no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * If this pass was index vacuuming, reinitialize the parallel context
+	 * so that workers can be relaunched for index cleanup later on.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing count */
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes, including the leader process.  After finishing each
+ * index, this function copies the index statistics returned from
+ * ambulkdelete and amvacuumcleanup to the DSM segment.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+		IndexBulkDeleteResult *bulkdelete_res;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get index statistics struct of this index */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/* Skip if this index doesn't support parallel index vacuuming */
+		if (shared_indstats == NULL)
+			continue;
+
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && stats[idx] == NULL)
+			stats[idx] = bulkdelete_res;
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment the first time we get it
+		 * from them, because they allocate it locally and a given index
+		 * may be vacuumed by a different vacuum process the next time.
+		 * Copying the result normally happens only after the first index
+		 * vacuuming pass.  From the second pass on, we pass the result in
+		 * the DSM segment so that ambulkdelete and amvacuumcleanup update
+		 * it directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result into
+		 * different slots, we can write them without locking.
+		 */
+		if (!shared_indstats->updated && stats[idx] != NULL)
+		{
+			memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+			shared_indstats->updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = bulkdelete_res;
+		}
+	}
+}
+
+/*
+ * Cleanup indexes.  This function must be used by the parallel vacuum
+ * leader process in the parallel vacuum case.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		/*
+		 * Generally index cleanup does not scan the index when index
+		 * vacuuming (ambulkdelete) has already been performed.  So we perform
+		 * index cleanup with parallel workers only if we have not
+		 * performed index vacuuming yet.  Otherwise, we do it in the
+		 * leader process alone.
+		 */
+		if (vacrelstats->num_index_scans == 0)
+			lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+													stats, lps);
+		else
+		{
+			/*
+			 * Do cleanup in the leader process alone.  Since we need to
+			 * copy the index statistics to the DSM segment, we cannot use
+			 * lazy_cleanup_index directly.
+			 */
+			vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
+											 lps->lvshared,
+											 vacrelstats->dead_tuples);
+		}
+
+		/*
+		 * Done if there are no indexes that do not support parallel index
+		 * vacuuming.  Otherwise fall through to do single-process vacuum
+		 * on such indexes.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already cleaned up during parallel
+		 * index cleanup.
+		 */
+		if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+			continue;
+
+		lazy_cleanup_index(Irel[idx], &stats[idx],
+						   vacrelstats->new_rel_tuples,
+						   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes.  This function must be used by the parallel vacuum leader
+ * process in the parallel vacuum case.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index vacuuming with
+	 * parallel workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+
+		/*
+		 * Done if there are no indexes that do not support parallel index
+		 * vacuuming.  Otherwise fall through to do single-process vacuum
+		 * on such indexes.
+		 */
+		if (lps->lvshared->nindexes_nonparallel == 0)
+			return;
+	}
+
+	for (idx = 0; idx < nindexes; idx++)
+	{
+		/*
+		 * Skip indexes that we have already vacuumed during parallel index
+		 * vacuuming.
+		 */
+		if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+			continue;
+
+		lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+						  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2240,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2279,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
 
-	pfree(stats);
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2642,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2666,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2722,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2875,330 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed together with parallel
+ * workers.  The sizes of the table and its indexes don't affect the
+ * parallel degree for now.  nrequested is the number of parallel workers
+ * that the user requested.  If nrequested is 0, we compute the parallel
+ * degree based on the number of indexes on the relation that support
+ * parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		IndexAmRoutine *amroutine = GetIndexAmRoutine(Irel[i]->rd_amhandler);
+
+		if (amroutine->amcanparallelvacuum)
+			nindexes_to_vacuum++;
+	}
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_to_vacuum == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		if (Irel[i]->rd_indam->amcanparallelvacuum)
+			est_shared = add_size(est_shared,
+									add_size(sizeof(LVSharedIndStats),
+											 index_parallelvacuum_estimate(Irel[i])));
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
+	prepare_index_statistics(shared, Irel, nindexes);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * and the struct size of each index.  This function also counts the indexes
+ * that do not support parallel index vacuuming and those that use
+ * maintenance_work_mem.  Since we don't currently support parallel vacuum
+ * for autovacuum, we don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (!Irel[i]->rd_indam->amcanparallelvacuum)
+		{
+			/* Set NULL as this index does not support parallel vacuum */
+			lvshared->bitmap[i >> 3] &= ~(1 << (i & 0x07));
+			lvshared->nindexes_nonparallel++;
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm : maintenance_work_mem;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * No writes are allowed during parallel mode, and it might not be
+ * safe to exit parallel mode while keeping the parallel context.
+ * So we copy the updated index statistics to local memory and then later
+ * use that to update the index statistics.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p = (char *) GetSharedIndStats(lvshared);
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers work only on index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process uses.
+	 * It's okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted in order by OID, which should
+	 * match the leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Do either index vacuuming or index cleanup */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 55d129a64f..86511b2703 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -140,6 +141,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 4b67b40b28..9ada501709 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree, which will then be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -1742,6 +1771,22 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		relation_close(onerel, lmode);
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+		/* It's OK to proceed with ANALYZE on this table */
+		return true;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e00dbab5aa..321a1511a8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3556,7 +3556,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..12065cc038 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..43702f2f86 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -184,6 +184,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers: -1 disables parallel vacuum
+	 * (the default); 0 chooses the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index aff0b10a93..91db6a10b0 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,20 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) vaccluster;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+WARNING:  skipping vacuum on "tmp" --- cannot vacuum temporary tables in parallel
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f0fee3af2b..66a9b110fe 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+VACUUM (PARALLEL) vaccluster;
+VACUUM (PARALLEL 2) vaccluster;
+VACUUM (PARALLEL 0) vaccluster; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (PARALLEL 2, FULL TRUE) vaccluster; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- error, cannot parallel vacuum temporary tables
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.22.0

#150Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#149)
1 attachment(s)

On Tue, Oct 29, 2019 at 4:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 28, 2019 at 2:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking if we can write the patch for both the approaches (a.
compute shared costs and try to delay based on that, b. try to divide
the I/O cost among workers as described in the email above[1]) and do
some tests to see the behavior of throttling, that might help us in
deciding what is the best strategy to solve this problem, if any.
What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested to quickly hack with the
approach (a) then we can do some testing and compare. Sawada-san,
by any chance will you be interested to write POC with approach (a)?
Otherwise, I will try to write it after finishing the first one
(approach b).

I have come up with the POC for approach (a).

Can we compute the overall throttling (sleep time) in the operation
separately for heap and index, then divide the index's sleep_time by
the number of workers and add it to heap's sleep time? Then, it will be
a bit easier to compare the data between parallel and non-parallel
case.

I have come up with a patch to compute the total delay during the
vacuum. So the idea of computing the total cost delay is

Total cost delay = Total delay of heap scan + Total delay of
index/worker; Patch is attached for the same.

I have prepared this patch on the latest patch of the parallel
vacuum[1]. I have also rebased the patch for the approach [b] for
dividing the vacuum cost limit and done some testing for computing the
I/O throttling. Attached patches 0001-POC-compute-total-cost-delay
and 0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't
rebased on top of v31-0006, because v31-0006 is implementing the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit is
doing the same with another approach. But,
0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
well (just 1-2 lines conflict).

Testing: I have performed 2 tests, one with the same size indexes and
second with the different size indexes and measured total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1784 (ms)       1398 (ms)         1938 (ms)

Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 100000;
create index idx3 on test(c) where a > 150000;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1438 (ms)       1029 (ms)         1529 (ms)

Conclusion:
1. The tests prove that the total I/O delay is significantly less with
the parallel vacuum.
2. With the vacuum cost divide the problem is solved but the delay bit
more compared to the non-parallel version. The reason could be the
problem discussed at[2], but it needs further investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I
will also try to test different types of indexes.

Thank you for testing!

I realized that the v31-0006 patch doesn't work correctly so I've attached an
updated version that also incorporates some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

FWIW I'd like to share the results of total delay time evaluation of
approach (a) (shared cost balance). I used the same workloads that
Dilip shared and set vacuum_cost_delay to 10. The results of two test
cases are here:

* Test1
normal : 12656 ms (hit 50594, miss 5700, dirty 7258, total 63552)
2 workers : 17149 ms (hit 47673, miss 8647, dirty 9157, total 65477)
1 worker : 19498 ms (hit 45954, miss 10340, dirty 10517, total 66811)

* Test2
normal : 1530 ms (hit 30645, miss 2, dirty 3, total 30650)
2 workers : 1538 ms (hit 30645, miss 2, dirty 3, total 30650)
1 worker : 1538 ms (hit 30645, miss 2, dirty 3, total 30650)

'hit', 'miss' and 'dirty' are the total numbers of buffer hits, buffer
misses and flushing dirty buffer, respectively. 'total' is the sum of
these three values.

In this evaluation I expect the parallel vacuum cases to delay as
much as normal vacuum, because the total number of pages to
vacuum is the same, we have the shared cost balance value, and each
worker decides to sleep based on that value. According to the above
Test1 results, we can see that there is a big difference in the total
delay time among these cases (the normal vacuum case is shortest), but
the cause of this is that parallel vacuum had to flush more dirty
pages. After increasing shared_buffers I got the expected results:

* Test1 (after increased shared_buffers)
normal : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
1 worker : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)

I updated the patch that computes the total cost delay shared by
Dilip[1] so that it collects the number of buffer hits and so on, and
have attached it. It can be applied on top of my latest patch set[1].
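
To illustrate approach (a), here is a minimal sketch of how a worker
could fold its local cost balance into the shared one and decide to
sleep. The variable names follow the patch set, but the helper itself
and the exact logic in v32 are assumptions, not the patch code:

#include "postgres.h"
#include "miscadmin.h"
#include "port/atomics.h"

/* Hypothetical illustration of the shared-cost-balance check */
static void
shared_cost_delay_point(void)
{
	uint32	balance;

	if (!VacuumCostActive || VacuumSharedCostBalance == NULL)
		return;

	/* Fold the locally accumulated cost into the shared balance */
	balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
									  VacuumCostBalance);
	VacuumCostBalance = 0;

	/* Sleep once the shared balance exceeds the (undivided) cost limit */
	if (balance >= (uint32) VacuumCostLimit)
	{
		pg_usleep((long) (VacuumCostDelay * 1000L));
		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostLimit);
	}
}

Since every worker charges against the same balance, the sum of sleeps
across all workers stays close to what a single-process vacuum would do.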

[1]: /messages/by-id/CAFiTN-thU-z8f04jO7xGMu5yUUpTpsBTvBrFW6EhRf-jGvEz=g@mail.gmail.com
[2]: /messages/by-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD+XOh_Q@mail.gmail.com

Regards,

--
Masahiko Sawada

Attachments:

PoC-delay-stats.patch (text/x-patch, US-ASCII)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a9d9f31887..5ed92ac8d7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -137,6 +137,7 @@
 #define PARALLEL_VACUUM_KEY_SHARED			1
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+#define PARALLEL_VACUUM_KEY_COST_DELAY		4
 
 /*
  * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
@@ -257,6 +258,21 @@ typedef struct LVSharedIndStats
 #define GetIndexBulkDeleteResult(s) \
 	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
 
+typedef struct LVDelayStats
+{
+	double	time;
+	int		hit;
+	int		miss;
+	int		dirty;
+} LVDelayStats;
+
+typedef struct LVCostDelay
+{
+	pg_atomic_uint32	nslot;
+	LVDelayStats 		stats[FLEXIBLE_ARRAY_MEMBER];
+} LVCostDelay;
+#define SizeOfLVCostDelay (offsetof(LVCostDelay, stats) + sizeof(LVDelayStats))
+
 /* Struct for parallel lazy vacuum */
 typedef struct LVParallelState
 {
@@ -265,6 +281,8 @@ typedef struct LVParallelState
 	/* Shared information among parallel vacuum workers */
 	LVShared		*lvshared;
 
+	/* Shared cost delay. */
+	LVCostDelay		*lvcostdelay;
 	/*
 	 * Always true except for a debugging case where
 	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
@@ -757,6 +775,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		parallel_workers = compute_parallel_workers(Irel, nindexes,
 													params->nworkers);
 
+	VacuumCostTotalDelay = 0;
 	if (parallel_workers > 0)
 	{
 		/*
@@ -1733,6 +1752,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					vacrelstats->scanned_pages, nblocks),
 			 errdetail_internal("%s", buf.data)));
 	pfree(buf.data);
+
+	elog(WARNING, "stats delay %lf, hit %d, miss %d, dirty %d, total %d",
+		 VacuumCostTotalDelay, _nhit, _nmiss, _ndirty,
+		_nhit + _nmiss + _ndirty);
 }
 
 
@@ -1967,6 +1990,8 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 										int nindexes, IndexBulkDeleteResult **stats,
 										LVParallelState *lps)
 {
+	int		i;
+
 	Assert(!IsParallelWorker());
 	Assert(ParallelVacuumIsActive(lps));
 	Assert(nindexes > 0);
@@ -2011,6 +2036,18 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	/* Continue to use the shared balance value */
 	VacuumCostBalance = pg_atomic_read_u32(&(lps->lvshared->cost_balance));
 
+	/*
+	 * Collect the delay from each worker and add it to the total delay. The
+	 * delay from the leader is already included in VacuumCostTotalDelay.
+	 */
+	for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+	{
+		VacuumCostTotalDelay += lps->lvcostdelay->stats[i].time;
+		_nhit += lps->lvcostdelay->stats[i].hit;
+		_nmiss += lps->lvcostdelay->stats[i].miss;
+		_ndirty += lps->lvcostdelay->stats[i].dirty;
+	}
+
 	/*
 	 * We need to reinitialize the parallel context as no more index vacuuming and
 	 * index cleanup will be performed after that.
@@ -2968,10 +3005,12 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	ParallelContext *pcxt;
 	LVShared		*shared;
 	LVDeadTuples	*dead_tuples;
+	LVCostDelay		*costdelay;
 	long	maxtuples;
 	char	*sharedquery;
 	Size	est_shared;
 	Size	est_deadtuples;
+	Size	est_costdelay;
 	int		querylen;
 	int		i;
 
@@ -3043,6 +3082,14 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	sharedquery[querylen] = '\0';
 	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
 
+	/* Shared cost delay statistics -- PARALLEL_VACUUM_KEY_COST_DELAY */
+	est_costdelay = MAXALIGN(add_size(SizeOfLVCostDelay,
+								   mul_size(sizeof(LVDelayStats), nrequested)));
+	costdelay = (LVCostDelay *) shm_toc_allocate(pcxt->toc, est_costdelay);
+	pg_atomic_init_u32(&(costdelay->nslot), 0);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_COST_DELAY, costdelay);
+	lps->lvcostdelay = costdelay;
+
 	return lps;
 }
 
@@ -3171,8 +3218,10 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	Relation	*indrels;
 	LVShared	*lvshared;
 	LVDeadTuples	*dead_tuples;
+	LVCostDelay		*costdelay;
 	int			nindexes;
 	char		*sharedquery;
+	int			slot;
 	IndexBulkDeleteResult **stats;
 
 	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
@@ -3207,6 +3256,11 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
 												  false);
 
+	costdelay = (LVCostDelay *) shm_toc_lookup(toc,
+												   PARALLEL_VACUUM_KEY_COST_DELAY,
+												   false);
+	slot = pg_atomic_fetch_add_u32(&(costdelay->nslot), 1);
+
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
@@ -3214,6 +3268,7 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
 	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	_nhit = _nmiss = _ndirty = 0;
 
 	stats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -3225,6 +3280,11 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
 									 dead_tuples);
 
+	/* Update the delay statistics in this worker's shared slot. */
+	costdelay->stats[slot].time = VacuumCostTotalDelay;
+	costdelay->stats[slot].hit = _nhit;
+	costdelay->stats[slot].miss = _nmiss;
+	costdelay->stats[slot].dirty = _ndirty;
 	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
 	table_close(onerel, ShareUpdateExclusiveLock);
 }
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1b9ea9b672..7fcd2d082f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -412,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		_nhit = _nmiss = _ndirty = 0;
 		VacuumSharedCostBalance = NULL;
 
 		/*
@@ -2050,6 +2051,8 @@ vacuum_delay_point(void)
 			/* update balance values for workers */
 			AutoVacuumUpdateDelay();
 
+			VacuumCostTotalDelay += msec;
+
 			/* Might have gotten an interrupt while sleeping */
 			CHECK_FOR_INTERRUPTS();
 		}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 483f705305..56e3631c6e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -770,7 +770,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			VacuumPageHit++;
 
 			if (VacuumCostActive)
+			{
 				VacuumCostBalance += VacuumCostPageHit;
+				_nhit++;
+			}
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
 											  smgr->smgr_rnode.node.spcNode,
@@ -959,7 +962,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 
 	VacuumPageMiss++;
 	if (VacuumCostActive)
+	{
 		VacuumCostBalance += VacuumCostPageMiss;
+		_nmiss++;
+	}
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
 									  smgr->smgr_rnode.node.spcNode,
@@ -1500,7 +1506,10 @@ MarkBufferDirty(Buffer buffer)
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
 		if (VacuumCostActive)
+		{
 			VacuumCostBalance += VacuumCostPageDirty;
+			_ndirty++;
+		}
 	}
 }
 
@@ -3556,7 +3565,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
 			if (VacuumCostActive)
+			{
 				VacuumCostBalance += VacuumCostPageDirty;
+				_ndirty++;
+			}
 		}
 	}
 }
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 3bf96de256..de214f347b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -139,11 +139,15 @@ int			VacuumCostPageMiss = 10;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 double		VacuumCostDelay = 0;
-
+double		VacuumCostTotalDelay = 0;
 int			VacuumPageHit = 0;
 int			VacuumPageMiss = 0;
 int			VacuumPageDirty = 0;
 
+int	_nhit = 0;
+int _nmiss = 0;
+int _ndirty = 0;
+
 int			VacuumCostBalance = 0;	/* working state for vacuum */
 bool		VacuumCostActive = false;
 
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index bc6e03fbc7..8d95b6ef4f 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -251,6 +251,10 @@ extern int	VacuumCostPageMiss;
 extern int	VacuumCostPageDirty;
 extern int	VacuumCostLimit;
 extern double VacuumCostDelay;
+extern double VacuumCostTotalDelay;
+extern int	_nhit;
+extern int _nmiss;
+extern int _ndirty;
 
 extern int	VacuumPageHit;
 extern int	VacuumPageMiss;
#151Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#150)

On Tue, Oct 29, 2019 at 1:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I updated the patch that computes the total cost delay shared by
Dilip[1] so that it collects the number of buffer hits and so on, and
have attached it. It can be applied on top of my latest patch set[1].

Thanks, Sawada-san. In my next test, I will use this updated patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#152Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#151)

On Tue, Oct 29, 2019 at 3:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

Thanks, Sawada-san. In my next test, I will use this updated patch.

Few comments on the latest patch.

+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
...
+
+ stats = (IndexBulkDeleteResult **)
+ palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+ if (lvshared->maintenance_work_mem_worker > 0)
+ maintenance_work_mem = lvshared->maintenance_work_mem_worker;

So for a worker we set the new value of maintenance_work_mem. But if
the leader is participating in the index vacuuming, shouldn't we set
the new value of maintenance_work_mem for the leader as well?
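
A minimal sketch of the kind of change being suggested, assuming the
leader participates through the same code path as the workers (the
placement and the save/restore are my guesses, not patch code):

	/* In the leader, before participating in index vacuuming */
	int		saved_mwm = maintenance_work_mem;

	if (lps->lvshared->maintenance_work_mem_worker > 0)
		maintenance_work_mem = lps->lvshared->maintenance_work_mem_worker;

	/* ... leader runs the same index vacuuming routine as workers ... */

	/* Restore the original value once participation is done */
	maintenance_work_mem = saved_mwm;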

+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;
+ int nindexes_mwm = 0;
+ int i;

Can this ever be called from an autovacuum worker? I think instead
of adding handling for the autovacuum worker we can have an assert.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#153Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#144)

On Mon, Oct 28, 2019 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I haven't yet read the new set of the patch. But, I have noticed one
thing. That we are getting the size of the statistics using the AM
routine. But, we are copying those statistics from local memory to
the shared memory directly using the memcpy. Wouldn't it be a good
idea to have an AM specific routine to get it copied from the local
memory to the shared memory? I am not sure it is worth it or not but
my thought behind this point is that it will give AM to have local
stats in any form ( like they can store a pointer in that ) but they
can serialize that while copying to shared stats. And, later when
shared stats are passed back to the AM then it can deserialize it into its
local form and use it.

You have a point, but after changing the gist index, we don't have any
current usage for indexes that need something like that. So, on one
side there is some value in having an API to copy the stats, but on
the other side without having clear usage of an API, it might not be
good to expose a new API for the same. I think we can expose such an
API in the future if there is a need for the same. Do you or anyone
know of any external IndexAM that has such a need?

Few minor comments while glancing through the latest patchset.

1. I think you can merge 0001*, 0002*, 0003* patch into one patch as
all three expose new variable/function from IndexAmRoutine.

Fixed.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from AutoVacuumWorkerProcess at
least not as of now, so isn't it a better idea to have an Assert for
it?

Fixed.

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function is for performing a parallel operation on the index, so
why does its name start with heap?

Because parallel vacuum supports only indexes that are created on heaps.

It is better to name it as
index_parallel_vacuum_main or simply parallel_vacuum_main.

I'm concerned that both names index_parallel_vacuum_main and
parallel_vacuum_main seem too generic, given that this code is
heap-specific.

4.
/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
BlockNumber pages_removed;
double tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
- /* List of TIDs of tuples we intend to delete */
- /* NB: this list is ordered by TID address */
- int num_dead_tuples; /* current # of entries */
- int max_dead_tuples; /* # slots allocated in array */
- ItemPointer dead_tuples; /* array of ItemPointerData */
+ LVDeadTuples *dead_tuples;
int num_index_scans;
TransactionId latestRemovedXid;
bool lock_waiter_detected;
} LVRelStats;

-
/* A few variables that don't seem worth passing around as parameters */
static int elevel = -1;

It seems like a spurious line removal.

Fixed.

The above comments are incorporated in the latest patch set (v32)[1].

[1]: /messages/by-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD+XOh_Q@mail.gmail.com

Regards,

--
Masahiko Sawada

#154Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#150)

On Tue, Oct 29, 2019 at 1:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

After increasing shared_buffers I got the expected results:

* Test1 (after increased shared_buffers)
normal : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
1 worker : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)

I updated the patch that computes the total cost delay shared by
Dilip[1] so that it collects the number of buffer hits and so on, and
have attached it. It can be applied on top of my latest patch set[1].

I tried to repeat the test to see the I/O delay with
v32-0004-PoC-shared-vacuum-cost-balance.patch [1], with 4GB of shared
memory. I recreated the database and restarted the server before each
run. But I could not see the same I/O delay, and the cost is also not
the same. Can you please tell me how much shared_buffers you set?

Test1 (4GB shared buffers)
normal: stats delay 1348.160000, hit 68952, miss 2, dirty 10063, total 79017
1 worker: stats delay 1821.255000, hit 78184, miss 2, dirty 14095, total 92281
2 workers: stats delay 2224.415000, hit 86482, miss 2, dirty 17665, total 104149

[1]: /messages/by-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD+XOh_Q@mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#155Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#154)

On Thu, Oct 31, 2019 at 11:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 1:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

After increasing shared_buffers I got the expected results:

* Test1 (after increased shared_buffers)
normal : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
1 worker : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)

I updated the patch that computes the total cost delay shared by
Dilip[1] so that it collects the number of buffer hits and so on, and
have attached it. It can be applied on top of my latest patch set[1].

While reading your modified patch (PoC-delay-stats.patch), I have
noticed that in my patch I used the formula below to compute the total
delay:
total delay = delay in heap scan + (total delay of index scan / nworkers)
But in your patch I can see that it is just the total sum of all
delays. IMHO, the total sleep time during the index vacuum phase
must be divided by the number of workers, because even if at some
point all the workers go to sleep (e.g. for 10 msec), the delay in
I/O will be only 10 msec, not 30 msec. I think the same is
discussed upthread[1].

[1]: /messages/by-id/CAA4eK1+PeiFLdTuwrE6CvbNdx80E-O=ZxCuWB2maREKFD-RaCA@mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#156Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dilip Kumar (#155)

On Thu, Oct 31, 2019 at 3:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

While reading your modified patch (PoC-delay-stats.patch), I have
noticed that in my patch I used the formula below to compute the total
delay:
total delay = delay in heap scan + (total delay of index scan / nworkers)
But in your patch I can see that it is just the total sum of all
delays. IMHO, the total sleep time during the index vacuum phase
must be divided by the number of workers, because even if at some
point all the workers go to sleep (e.g. for 10 msec), the delay in
I/O will be only 10 msec, not 30 msec. I think the same is
discussed upthread[1]

I think the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the vacuum
were performed by a single process, whereas in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000 blocks,
the cost per block is 10, the cost limit is 200 and the sleep time is 5
ms. In a single-process vacuum the total sleep time is 2,500ms (=
(10,000 * 10 / 200) * 5). Approach (a) is the same, 2,500ms,
because all parallel vacuum workers use the shared balance value and a
worker sleeps once the balance value exceeds the limit. In
approach (b), since the cost limit is divided evenly, the limit of each
worker is 40 (e.g. with parallel degree 5). Supposing each worker
processes blocks evenly, the total sleep time of all workers is
12,500ms (= (2,000 * 10 / 40) * 5 * 5). I think that's why we can
compute the sleep time of approach (b) by dividing the total value by
the number of parallel workers.

IOW approach (b) makes parallel vacuum delay much more than normal
vacuum, and more than parallel vacuum with approach (a), even with the
same settings. Which behavior do we expect? I thought the vacuum delay
for parallel vacuum should work as if it were a single-process vacuum,
as we did for memory usage. I might be missing something. If we prefer
approach (b) I should change the patch so that the leader process
divides the cost limit evenly.
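
To double-check the arithmetic, here is a small self-contained C
program that simulates both schemes under the assumptions above
(uniform cost per block, blocks spread evenly across workers); it is
only an illustration, not code from the patch:

#include <stdio.h>

/*
 * Simulate cost-based throttling: sleep delay_ms milliseconds each time
 * the accumulated balance reaches cost_limit.  Returns total sleep time.
 */
static double
simulate(int nblocks, int cost_per_block, int cost_limit, double delay_ms)
{
	int		balance = 0;
	double	sleep_ms = 0.0;
	int		i;

	for (i = 0; i < nblocks; i++)
	{
		balance += cost_per_block;
		if (balance >= cost_limit)
		{
			sleep_ms += delay_ms;
			balance -= cost_limit;
		}
	}
	return sleep_ms;
}

int
main(void)
{
	int		nworkers = 5;

	/* Single process, or approach (a): one balance, undivided limit */
	printf("single / (a): %.0f ms\n", simulate(10000, 10, 200, 5.0));

	/* Approach (b): each worker gets 1/nworkers of the blocks and limit */
	printf("(b), summed:  %.0f ms\n",
		   nworkers * simulate(10000 / nworkers, 10, 200 / nworkers, 5.0));
	return 0;
}

This prints 2,500ms for the single-process and approach (a) cases and
12,500ms summed across workers for approach (b), matching the numbers
above.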

Regards,

--
Masahiko Sawada

#157Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#156)

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

IOW approach (b) makes parallel vacuum delay much more than normal
vacuum, and more than parallel vacuum with approach (a), even with the
same settings. Which behavior do we expect? I thought the vacuum delay
for parallel vacuum should work as if it were a single-process vacuum,
as we did for memory usage. I might be missing something. If we prefer
approach (b) I should change the patch so that the leader process
divides the cost limit evenly.

I have repeated the same tests (test1 and test2)[1] with a larger
shared buffer (1GB). Currently, I have used the same formula for
computing the total delay:
heap scan delay + (index vacuuming delay / workers). Because, in my
opinion, multiple workers are doing I/O here, the total delay should
also be a multiple of the number of workers, so if we want to compare
the delay with the sequential vacuum we should divide the total delay
by the number of workers. But I am not sure whether computing the
total delay is the right way to measure the I/O throttling. Still, I
support approach (b) for dividing the I/O limit because autovacuum
workers already operate with this approach.

test1:
normal:   stats delay 1348.160000, hit 68952, miss 2, dirty 10063, total 79017
1 worker: stats delay 1349.585000, hit 68954, miss 2, dirty 10146, total 79102 (cost divide patch)
2 worker: stats delay 1341.416141, hit 68956, miss 2, dirty 10036, total 78994 (cost divide patch)
1 worker: stats delay 1025.495000, hit 78184, miss 2, dirty 14066, total 92252 (share cost patch)
2 worker: stats delay 904.366667, hit 86482, miss 2, dirty 17806, total 104290 (share cost patch)

test2:
normal:   stats delay 530.475000, hit 36982, miss 2, dirty 3488, total 40472
1 worker: stats delay 530.700000, hit 36984, miss 2, dirty 3527, total 40513 (cost divide patch)
2 worker: stats delay 530.675000, hit 36984, miss 2, dirty 3532, total 40518 (cost divide patch)
1 worker: stats delay 490.570000, hit 39090, miss 2, dirty 3497, total 42589 (share cost patch)
2 worker: stats delay 480.571667, hit 39050, miss 2, dirty 3819, total 42871 (share cost patch)

So with higher shared buffers, I can see that with approach (b) we
get the same total delay as the normal vacuum, while with approach
(a) I see a bit less total delay. A point to be noted is that I have
used the same formula for computing the total delay for both
approaches, and Sawada-san explained in the mail above that it may
not be the right way to compute the total delay for approach (a).
Still, my take is that whether we are working with a shared cost or
dividing the cost, the delay must be divided by the number of workers
in the parallel phase. @Amit Kapila, what is your opinion on this?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#158Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#145)

On Mon, Oct 28, 2019 at 1:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Oct 28, 2019 at 12:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I haven't yet read the new set of patches, but I have noticed one
thing: we are getting the size of the statistics using the AM
routine, yet we are copying those statistics from local memory to
shared memory directly using memcpy. Wouldn't it be a good idea to
have an AM-specific routine to copy them from local memory to shared
memory? I am not sure whether it is worth it or not, but my thought
behind this point is that it would allow the AM to keep its local
stats in any form (e.g. it could store a pointer there) and serialize
them while copying to the shared stats. Later, when the shared stats
are passed back to the AM, it can deserialize them into its local
form and use them.

You have a point, but after changing the gist index, we don't have
any current usage for indexes that need something like that. So, on
one side there is some value in having an API to copy the stats, but
on the other side, without a clear use for such an API, it might not
be good to expose a new one. I think we can expose such an API in the
future if the need arises.

I agree with the point. But the current patch exposes an API for
estimating the size of the statistics. So IMHO we should either
expose both APIs (for estimating the size of the stats and for
copying the stats) or neither. Am I missing something here?

I think the first one is a must as things stand today, because
otherwise we won't be able to copy the stats. The second one
(exposing an API to copy the stats) is good to have, but there is no
immediate use for it. We could expose the second API with future
needs in mind, but as there is no valid case right now, it would be
difficult to test, and we are also not sure whether any IndexAM will
require such an API in the future.
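
To make that concrete, the callback pair could look something like
this rough sketch (hypothetical names; nothing like this exists in
the patch or in core, and the type is stubbed so the fragment stands
alone):

#include <stddef.h>

typedef struct IndexBulkDeleteResult IndexBulkDeleteResult; /* stand-in */

/* what the patch already needs: bytes of DSM the stats will occupy */
typedef size_t (*amestimatestats_fn) (void);

/* flatten possibly pointer-laden local stats into the DSM chunk */
typedef void (*amserializestats_fn) (const IndexBulkDeleteResult *local,
                                     void *shared_space);

/* rebuild the AM-local representation from the flat copy */
typedef IndexBulkDeleteResult *(*amdeserializestats_fn) (const void *shared_space);

An AM with plain flat stats would simply point the serialize callback
at a memcpy-based default, so the common case stays as cheap as it is
today.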

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#159Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#156)

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b). I think we need input from some
other people as well. I will start a separate thread to discuss this
and see if that helps to get input from others.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#160Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#157)

On Sun, Nov 3, 2019 at 9:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect? I thought the vacuum delay
for parallel vacuum should work as if it were a single-process
vacuum, as we did for memory usage. I might be missing something. If
we prefer approach (b), I should change the patch so that the leader
process divides the cost limit evenly.

I have repeated the same tests (test1 and test2)[1] with higher
shared buffers (1GB). Currently, I have used the same formula for
computing the total delay: heap scan delay + (index vacuuming delay /
workers). In my opinion, multiple workers are doing I/O here, so the
total delay would otherwise be a multiple of the number of workers;
if we want to compare the delay with the sequential vacuum, then we
should divide the total delay by the number of workers. But I am not
sure whether computing the total delay is the right way to evaluate
the I/O throttling or not. Still, I support approach (b) for dividing
the I/O limit, because autovacuum workers already operate with this
approach.

test1:
normal:   stats delay 1348.160000, hit 68952, miss 2, dirty 10063, total 79017
1 worker: stats delay 1349.585000, hit 68954, miss 2, dirty 10146, total 79102 (cost divide patch)
2 worker: stats delay 1341.416141, hit 68956, miss 2, dirty 10036, total 78994 (cost divide patch)
1 worker: stats delay 1025.495000, hit 78184, miss 2, dirty 14066, total 92252 (share cost patch)
2 worker: stats delay 904.366667, hit 86482, miss 2, dirty 17806, total 104290 (share cost patch)

test2:
normal:   stats delay 530.475000, hit 36982, miss 2, dirty 3488, total 40472
1 worker: stats delay 530.700000, hit 36984, miss 2, dirty 3527, total 40513 (cost divide patch)
2 worker: stats delay 530.675000, hit 36984, miss 2, dirty 3532, total 40518 (cost divide patch)
1 worker: stats delay 490.570000, hit 39090, miss 2, dirty 3497, total 42589 (share cost patch)
2 worker: stats delay 480.571667, hit 39050, miss 2, dirty 3819, total 42871 (share cost patch)

So with higher shared buffers, I can see that with approach (b) we
get the same total delay as the normal vacuum, while with approach
(a) I see a bit less total delay. A point to be noted is that I have
used the same formula for computing the total delay for both
approaches, and Sawada-san explained in the mail above that it may
not be the right way to compute the total delay for approach (a).
Still, my take is that whether we are working with a shared cost or
dividing the cost, the delay must be divided by the number of workers
in the parallel phase.

Why do you think so? I think with approach (b), if all the workers
are doing an equal amount of I/O, they will probably sleep at the
same time, whereas with approach (a) each of them will sleep at
different times. So dividing the delay probably makes more sense for
approach (b).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#161Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#160)

On Mon, Nov 4, 2019 at 10:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Nov 3, 2019 at 9:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect? I thought the vacuum delay
for parallel vacuum should work as if it were a single-process
vacuum, as we did for memory usage. I might be missing something. If
we prefer approach (b), I should change the patch so that the leader
process divides the cost limit evenly.

I have repeated the same tests (test1 and test2)[1] with higher
shared buffers (1GB). Currently, I have used the same formula for
computing the total delay: heap scan delay + (index vacuuming delay /
workers). In my opinion, multiple workers are doing I/O here, so the
total delay would otherwise be a multiple of the number of workers;
if we want to compare the delay with the sequential vacuum, then we
should divide the total delay by the number of workers. But I am not
sure whether computing the total delay is the right way to evaluate
the I/O throttling or not. Still, I support approach (b) for dividing
the I/O limit, because autovacuum workers already operate with this
approach.

test1:
normal:   stats delay 1348.160000, hit 68952, miss 2, dirty 10063, total 79017
1 worker: stats delay 1349.585000, hit 68954, miss 2, dirty 10146, total 79102 (cost divide patch)
2 worker: stats delay 1341.416141, hit 68956, miss 2, dirty 10036, total 78994 (cost divide patch)
1 worker: stats delay 1025.495000, hit 78184, miss 2, dirty 14066, total 92252 (share cost patch)
2 worker: stats delay 904.366667, hit 86482, miss 2, dirty 17806, total 104290 (share cost patch)

test2:
normal:   stats delay 530.475000, hit 36982, miss 2, dirty 3488, total 40472
1 worker: stats delay 530.700000, hit 36984, miss 2, dirty 3527, total 40513 (cost divide patch)
2 worker: stats delay 530.675000, hit 36984, miss 2, dirty 3532, total 40518 (cost divide patch)
1 worker: stats delay 490.570000, hit 39090, miss 2, dirty 3497, total 42589 (share cost patch)
2 worker: stats delay 480.571667, hit 39050, miss 2, dirty 3819, total 42871 (share cost patch)

So with higher shared buffers, I can see that with approach (b) we
get the same total delay as the normal vacuum, while with approach
(a) I see a bit less total delay. A point to be noted is that I have
used the same formula for computing the total delay for both
approaches, and Sawada-san explained in the mail above that it may
not be the right way to compute the total delay for approach (a).
Still, my take is that whether we are working with a shared cost or
dividing the cost, the delay must be divided by the number of workers
in the parallel phase.

Why do you think so? I think with approach (b), if all the workers
are doing an equal amount of I/O, they will probably sleep at the
same time, whereas with approach (a) each of them will sleep at
different times. So dividing the delay probably makes more sense for
approach (b).

Just to be clear, I did not mean that we divide the sleep time for
each worker. Actually, I meant how to project the total delay in the
test patch. So I think if we directly want to compare the sleep time
of the sequential vs. the parallel vacuum, then it's not fair to just
compare the total sleep time, because when multiple workers are
working in parallel, shouldn't we consider their average sleep time?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#162Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#159)

On Mon, Nov 4, 2019 at 10:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b). I think we need input from some
other people as well. I will start a separate thread to discuss this
and see if that helps to get input from others.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#163Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#159)

On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b).

Can we get the same sleep time as approach (b) if we divide the cost
limit by the number of workers and have the shared cost balance (i.e.
approach (a) with a divided cost limit)? Currently approach (b) seems
better, but I'm concerned that it might unnecessarily delay vacuum if
some indexes are very small or the bulk-deletion of an index does
almost nothing, as with brin.
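
(As a quick check with the example numbers from upthread: with a
shared balance and a divided limit of 200 / 5 = 40, a sleep is
triggered once per 40 of accumulated cost, i.e. (10,000 * 10 / 40) *
5 = 12,500 ms of sleeping in total, the same total as approach (b)
when the work is spread evenly.)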

I think we need input from some other people as well. I will start a
separate thread to discuss this and see if that helps to get input
from others.

+1

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#164Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#163)

On Mon, Nov 4, 2019 at 1:00 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b).

Can we get the same sleep time as approach (b) if we divide the cost
limit by the number of workers and have the shared cost balance (i.e.
approach (a) with a divided cost limit)? Currently approach (b) seems
better, but I'm concerned that it might unnecessarily delay vacuum if
some indexes are very small or the bulk-deletion of an index does
almost nothing, as with brin.

Are you worried that some of the workers might not have much I/O to
do but we still divide the cost limit equally? If so, isn't that the
case with the autovacuum workers as well?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#165Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#164)

On Mon, 4 Nov 2019 at 17:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 4, 2019 at 1:00 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b).

Can we get the same sleep time as approach (b) if we divide the cost
limit by the number of workers and have the shared cost balance (i.e.
approach (a) with a divided cost limit)? Currently approach (b) seems
better, but I'm concerned that it might unnecessarily delay vacuum if
some indexes are very small or the bulk-deletion of an index does
almost nothing, as with brin.

Are you worried that some of the workers might not have much I/O to
do but we still divide the cost limit equally?

Yes.

If so, isn't that the case with the autovacuum workers as well?

I think it is not, because we rebalance the cost after an autovacuum
worker finishes. So, as Amit mentioned on the new thread, we might
need to make parallel vacuum workers notify the leader once they exit
so that it can rebalance the cost.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#166Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#165)

On Mon, Nov 4, 2019 at 2:11 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 4 Nov 2019 at 17:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 4, 2019 at 1:00 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the
vacuum were performed by a single process, while in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) is the same, 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, each worker's limit is 40
(e.g. at parallel degree 5). Supposing each worker processes blocks
evenly, the total sleep time of all workers is 12,500 ms (= (2,000 *
10 / 40) * 5 * 5). I think that's why we can compute the sleep time
of approach (b) by dividing the total value by the number of parallel
workers.

IOW approach (b) delays a parallel vacuum much more than a normal
vacuum or a parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect?

Yeah, this is an important thing to decide. I don't think that the
conclusion you are drawing is correct, because if that is true then
the same applies to the current autovacuum work division, where we
divide the cost_limit among workers but the cost_delay is the same
(see autovac_balance_cost). Basically, if we consider the delay time
of each worker independently, then it would appear that a parallel
vacuum delays more with approach (b), but that is true only if the
workers run serially, which is not the case.

I thought the vacuum delay for parallel vacuum should work as if it
were a single-process vacuum, as we did for memory usage. I might be
missing something. If we prefer approach (b), I should change the
patch so that the leader process divides the cost limit evenly.

I am also not completely sure which approach is better, but I
slightly lean towards approach (b).

Can we get the same sleep time as approach (b) if we divide the cost
limit by the number of workers and have the shared cost balance (i.e.
approach (a) with a divided cost limit)? Currently approach (b) seems
better, but I'm concerned that it might unnecessarily delay vacuum if
some indexes are very small or the bulk-deletion of an index does
almost nothing, as with brin.

Are you worried that some of the workers might not have much I/O to
do but we still divide the cost limit equally?

Yes.

If so, isn't that the case with the autovacuum workers as well?

I think it is not, because we rebalance the cost after an autovacuum
worker finishes. So, as Amit mentioned on the new thread, we might
need to make parallel vacuum workers notify the leader once they exit
so that it can rebalance the cost.

I agree that when an autovacuum worker finishes we rebalance the
cost, and we need to do something similar here. That will be a bit
difficult to implement in the parallel vacuum case.

We might need some shared memory array where we can set a worker's
status to running as soon as it starts. When a worker exits, we can
set it to false and also set a flag saying that we need cost
rebalancing. Then, in vacuum_delay_point, if we see that we need to
rebalance, we can process the shared memory array, find out how many
workers are running, and rebalance based on that. Having said that, I
think for rebalancing we just need a shared memory counter of how
many workers are running.
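
A minimal sketch of the counter idea (C11 atomics standing in for
pg_atomic_* and for a real DSM segment; all names here are
illustrative, not from the patch):

#include <stdatomic.h>

typedef struct ParallelVacuumThrottle
{
    atomic_int nworkers_running;    /* lives in shared memory */
    int        cost_limit;          /* limit for the whole vacuum job */
} ParallelVacuumThrottle;

void
worker_startup(ParallelVacuumThrottle *t)
{
    atomic_fetch_add(&t->nworkers_running, 1);
}

void
worker_exit(ParallelVacuumThrottle *t)
{
    atomic_fetch_sub(&t->nworkers_running, 1);
}

/* called from the delay point: the share rebalances automatically */
int
my_cost_limit(ParallelVacuumThrottle *t)
{
    int running = atomic_load(&t->nworkers_running);

    return t->cost_limit / (running > 0 ? running : 1);
}

With this, a worker that exits simply stops being counted, and every
remaining worker's share grows the next time it recomputes its limit,
with no explicit notification to the leader required.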

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#167Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#149)
1 attachment(s)

Hi,
I took all the attached patches (v32-01 to v32-04) and one of Dilip's
patches from the "Questions/Observations related to Gist vacuum" mail
thread. On top of all these patches, I created one more patch to test
parallel vacuum functionality across the whole existing test suite.
For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel
option is given with vacuum. So to test, I used the existing GUC
force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is
not given with vacuum, I force the use of parallel workers for
vacuum. If there is only one index and the parallel degree is not
given with vacuum (or the parallel option is not given), and
force_parallel_mode = regress, then I launch one parallel worker (the
leader does no index work in this case); but if there is more than
one index, then I use the leader as a worker for one index and launch
workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress,
all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using
parallel workers even if force_parallel_mode is set to on, or should
we use a new GUC to test parallel worker vacuum with the existing
test suite?

Please let me know your thoughts on this patch.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

On Tue, 29 Oct 2019 at 12:37, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 28, 2019 at 2:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking if we can write the patch for both the approaches (a.
compute shared costs and try to delay based on that, b. try to divide
the I/O cost among workers as described in the email above[1]) and do
some tests to see the behavior of throttling; that might help us in
deciding what is the best strategy to solve this problem, if any.
What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested to quickly hack with
approach (a), then we can do some testing and compare.

Sawada-san, by any chance will you be interested to write a POC with
approach (a)? Otherwise, I will try to write it after finishing the
first one (approach b).

I have come up with the POC for approach (a).

Can we compute the overall throttling (sleep time) in the operation
separately for heap and index, then divide the index's sleep_time by
the number of workers and add it to the heap's sleep time? Then, it
will be a bit easier to compare the data between the parallel and
non-parallel cases.

I have come up with a patch to compute the total delay during the
vacuum. The idea for computing the total cost delay is:

Total cost delay = Total delay of heap scan + (Total delay of index / nworkers)

A patch is attached for the same.

I have prepared this patch on top of the latest parallel vacuum
patch[1]. I have also rebased the patch for approach (b), which
divides the vacuum cost limit, and done some testing to measure the
I/O throttling. The attached patches
0001-POC-compute-total-cost-delay and
0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't
rebased on top of v31-0006, because v31-0006 implements the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit
does the same with another approach. But
0001-POC-compute-total-cost-delay can be applied on top of v31-0006
as well (just a 1-2 line conflict).

Testing: I have performed two tests, one with same-size indexes and a
second with different-size indexes, and measured the total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1784 (ms)       1398 (ms)         1938 (ms)

Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 100000;
create index idx3 on test(c) where a > 150000;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1438 (ms)       1029 (ms)         1529 (ms)

Conclusion:
1. The tests show that the total I/O delay is significantly less with
the parallel vacuum.
2. With the vacuum cost divide patch the problem is solved, but the
delay is a bit more compared to the non-parallel version. The reason
could be the problem discussed at [2], but it needs further
investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I
will also try to test different types of indexes.

Thank you for testing!

I realized that the v31-0006 patch doesn't work correctly, so I've
attached an updated version that also incorporates some comments I
got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

Regards,

--
Masahiko Sawada

Attachments:

Force_all_vacuum_to_use_parallel_vacuum_v1.patch (application/octet-stream)
commit b9308a8474cee6c71e16acae59d969f7266bad25
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Date:   Tue Nov 5 16:27:31 2019 +0530

    Use parallel vacuum if force_parallel_mode is set to regress

    When force_parallel_mode is set to regress and there is no
    parallel option given with vacuum (or the parallel option is
    given without a degree), then if there is one index, launch
    one parallel worker, and if there is more than one index, use
    the leader as one worker and launch workers for the remaining
    indexes.

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a9d9f31..ac798a5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -67,6 +67,7 @@
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/optimizer.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
@@ -2943,6 +2944,15 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
 	leaderparticipates = false;
 #endif
 
+	/*
+	 * If there is only one index and force_parallel_mode is set to regress,
+	 * but the number of parallel workers was not given with the vacuum
+	 * command, then we will always launch one worker for vacuum.
+	 */
+	if (force_parallel_mode == FORCE_PARALLEL_REGRESS && nrequested == 0 &&
+		nindexes_to_vacuum == 1)
+		leaderparticipates = false;
+
 	/* The leader process takes one index */
 	if (leaderparticipates)
 		nindexes_to_vacuum--;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 905d173..ca53f25 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -40,6 +40,7 @@
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
@@ -170,6 +171,14 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(full ? VACOPT_FULL : 0) |
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
 
+	/*
+	 * If force_parallel_mode is set to regress and no parallel option is
+	 * specified with vacuum, then enable parallel vacuum.
+	 */
+	if (params.nworkers == -1 && !(params.options & VACOPT_FULL) &&
+		force_parallel_mode == FORCE_PARALLEL_REGRESS)
+		params.nworkers = 0;
+
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((params.options & VACOPT_VACUUM) ||
#168Dilip Kumar
dilipbalaut@gmail.com
In reply to: Mahendra Singh (#167)

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-04) and one of Dilip's patches from the "Questions/Observations related to Gist vacuum" mail thread. On top of all these patches, I created one more patch to test parallel vacuum functionality across the whole existing test suite.
For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel option is given with vacuum. So to test, I used the existing GUC force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is not given with vacuum, I force the use of parallel workers for vacuum. If there is only one index and the parallel degree is not given with vacuum (or the parallel option is not given), and force_parallel_mode = regress, then I launch one parallel worker (the leader does no index work in this case); but if there is more than one index, then I use the leader as a worker for one index and launch workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress, all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using parallel workers even if force_parallel_mode is set to on, or should we use a new GUC to test parallel worker vacuum with the existing test suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on. However,
force_parallel_mode=regress is useful for testing vacuum with the
existing test suite.

Please let me know your thoughts on this patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#169Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#168)

On Wed, 6 Nov 2019 at 18:42, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-04) and one of Dilip's patches from the "Questions/Observations related to Gist vacuum" mail thread. On top of all these patches, I created one more patch to test parallel vacuum functionality across the whole existing test suite.

Thank you for looking at this patch!

For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel option is given with vacuum. So to test, I used the existing GUC force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is not given with vacuum, I force the use of parallel workers for vacuum. If there is only one index and the parallel degree is not given with vacuum (or the parallel option is not given), and force_parallel_mode = regress, then I launch one parallel worker (the leader does no index work in this case); but if there is more than one index, then I use the leader as a worker for one index and launch workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress, all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using parallel workers even if force_parallel_mode is set to on, or should we use a new GUC to test parallel worker vacuum with the existing test suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on.

Yeah, I think so too. force_parallel_mode is a planner parameter, and
parallel vacuum can be forced by a vacuum option.

However, force_parallel_mode=regress is useful for testing vacuum
with the existing test suite.

If we want to control the leader participation by a GUC parameter, I
think we would need another GUC parameter rather than using
force_parallel_mode. And it would be useful if we could use the
parameter for parallel CREATE INDEX as well. But that should be a
separate patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#170Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#169)

On Wed, Nov 6, 2019 at 3:50 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 6 Nov 2019 at 18:42, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-04) and one of Dilip's patches from the "Questions/Observations related to Gist vacuum" mail thread. On top of all these patches, I created one more patch to test parallel vacuum functionality across the whole existing test suite.

Thank you for looking at this patch!

For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel option is given with vacuum. So to test, I used the existing GUC force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is not given with vacuum, I force the use of parallel workers for vacuum. If there is only one index and the parallel degree is not given with vacuum (or the parallel option is not given), and force_parallel_mode = regress, then I launch one parallel worker (the leader does no index work in this case); but if there is more than one index, then I use the leader as a worker for one index and launch workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress, all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using parallel workers even if force_parallel_mode is set to on, or should we use a new GUC to test parallel worker vacuum with the existing test suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on.

Yeah, I think so too. force_parallel_mode is a planner parameter, and
parallel vacuum can be forced by a vacuum option.

However, force_parallel_mode=regress is useful for testing vacuum
with the existing test suite.

If we want to control the leader participation by a GUC parameter, I
think we would need another GUC parameter rather than using
force_parallel_mode.

I think the purpose is not to disable the leader participation;
instead, I think the purpose of 'force_parallel_mode=regress' is
that, without changing the existing test suite, we can execute the
existing vacuum commands in the test suite with a worker. I did not
study the patch, but the idea should be that if
"force_parallel_mode=regress" then a normal vacuum command should be
executed in parallel by using one worker.

And it would be useful if we could use the parameter for parallel
CREATE INDEX as well. But that should be a separate patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#171Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#170)

On Wed, 6 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 3:50 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 6 Nov 2019 at 18:42, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-04) and one of Dilip's patches from the "Questions/Observations related to Gist vacuum" mail thread. On top of all these patches, I created one more patch to test parallel vacuum functionality across the whole existing test suite.

Thank you for looking at this patch!

For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel option is given with vacuum. So to test, I used the existing GUC force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is not given with vacuum, I force the use of parallel workers for vacuum. If there is only one index and the parallel degree is not given with vacuum (or the parallel option is not given), and force_parallel_mode = regress, then I launch one parallel worker (the leader does no index work in this case); but if there is more than one index, then I use the leader as a worker for one index and launch workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress, all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using parallel workers even if force_parallel_mode is set to on, or should we use a new GUC to test parallel worker vacuum with the existing test suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on.

Yeah, I think so too. force_parallel_mode is a planner parameter, and
parallel vacuum can be forced by a vacuum option.

However, force_parallel_mode=regress is useful for testing vacuum
with the existing test suite.

If we want to control the leader participation by a GUC parameter, I
think we would need another GUC parameter rather than using
force_parallel_mode.

I think the purpose is not to disable the leader participation;
instead, I think the purpose of 'force_parallel_mode=regress' is
that, without changing the existing test suite, we can execute the
existing vacuum commands in the test suite with a worker. I did not
study the patch, but the idea should be that if
"force_parallel_mode=regress" then a normal vacuum command should be
executed in parallel by using one worker.

Oh, I got it. Considering the current parallel vacuum design, I'm not
sure that we can cover more test cases by forcing parallel vacuum
during the existing test suite, because most of these would be tables
with several indexes and one index vacuum cycle. It might be better
to add more test cases for parallel vacuum.

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#172Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#171)

On Wed, 6 Nov 2019, 20:07 Masahiko Sawada, <masahiko.sawada@2ndquadrant.com>
wrote:

On Wed, 6 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 3:50 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 6 Nov 2019 at 18:42, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-04) and one of Dilip's
patches from the "Questions/Observations related to Gist vacuum" mail
thread. On top of all these patches, I created one more patch to test
parallel vacuum functionality across the whole existing test suite.

Thank you for looking at this patch!

For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if the parallel
option is given with vacuum. So to test, I used the existing GUC
force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is
not given with vacuum, I force the use of parallel workers for
vacuum. If there is only one index and the parallel degree is not
given with vacuum (or the parallel option is not given), and
force_parallel_mode = regress, then I launch one parallel worker (the
leader does no index work in this case); but if there is more than
one index, then I use the leader as a worker for one index and launch
workers for all the other indexes.

After applying this patch and setting force_parallel_mode = regress,
all test cases are passing (make check-world).

I have a question regarding my patch: should we do vacuuming using
parallel workers even if force_parallel_mode is set to on, or should
we use a new GUC to test parallel worker vacuum with the existing
test suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on.

Yeah, I think so too. force_parallel_mode is a planner parameter, and
parallel vacuum can be forced by a vacuum option.

However, force_parallel_mode=regress is useful for testing vacuum
with the existing test suite.

If we want to control the leader participation by a GUC parameter, I
think we would need another GUC parameter rather than using
force_parallel_mode.

I think the purpose is not to disable the leader participation;
instead, I think the purpose of 'force_parallel_mode=regress' is
that, without changing the existing test suite, we can execute the
existing vacuum commands in the test suite with a worker. I did not
study the patch, but the idea should be that if
"force_parallel_mode=regress" then a normal vacuum command should be
executed in parallel by using one worker.

Oh, I got it. Considering the current parallel vacuum design, I'm not
sure that we can cover more test cases by forcing parallel vacuum
during the existing test suite, because most of these would be tables
with several indexes and one index vacuum cycle.

Oh sure, but it would still be good to get them tested with the
parallel vacuum.

It might be better to add
more test cases for parallel vacuum.

I agree that it would be good to add additional test cases.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#173Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#149)

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly, so I've
attached an updated version that also incorporates some comments I
got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

+ /*
+ * Generally index cleanup does not scan the index when index
+ * vacuuming (ambulkdelete) was already performed.  So we perform
+ * index cleanup with parallel workers only if we have not
+ * performed index vacuuming yet.  Otherwise, we do it in the
+ * leader process alone.
+ */
+ if (vacrelstats->num_index_scans == 0)
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);

Today, I was thinking about this point. This check will work for most
cases, but there are still exceptions: for a brin index, the main
work is done in the amvacuumcleanup function. Similarly, I think
there are a few more indexes, like gin and bloom, where it seems we
take another pass over the index in the amvacuumcleanup phase. Don't
you think we should try to allow parallel workers for such cases? If
so, I don't have any great ideas on how to do that, but what comes to
my mind is to indicate it via the stats (IndexBulkDeleteResult) or
via an indexam API. I am not sure if it is acceptable to have an
indexam API for this.

Thoughts?
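
For instance, the indexam variant could be as simple as a boolean in
the AM's routine table; the names below are hypothetical, only to
illustrate the shape of the check:

#include <stdbool.h>

typedef struct IndexAmRoutineStub   /* stand-in for IndexAmRoutine */
{
    /* does amvacuumcleanup scan the index even after ambulkdelete? */
    bool amcleanupneedsfullscan;
} IndexAmRoutineStub;

bool
cleanup_can_use_parallel(const IndexAmRoutineStub *amroutine,
                         int num_index_scans)
{
    /*
     * Use a parallel worker for cleanup either when no bulkdelete pass
     * has happened yet, or when the AM declares that cleanup re-scans
     * the index anyway (brin, and it seems gin and bloom too).
     */
    return num_index_scans == 0 || amroutine->amcleanupneedsfullscan;
}

The stats-based variant would instead have ambulkdelete set a flag in
IndexBulkDeleteResult, which the leader could inspect before deciding
how to run the cleanup phase.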

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#174Mahendra Singh
mahi6run@gmail.com
In reply to: Dilip Kumar (#170)
1 attachment(s)

Thanks, Masahiko-san and Dilip, for looking into this patch.

In the previous patch, when 'force_parallel_mode=regress', I was
doing all the vacuuming using multiple workers, but we should do all
the vacuuming using only one worker (the leader should not
participate in vacuuming). So I am attaching a patch for the same.

What does this patch do?
If 'force_parallel_mode=regress' and the parallel option is not given
with vacuum, then all the vacuuming work will be done by one single
worker and the leader will not participate. But if the parallel
option is given with vacuum, then preference will be given to the
specified degree.

After applying this patch, all the test cases are passing (make
check-world) and I can't see any improvement in code coverage with
this patch.

Please let me know your thoughts on this patch.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

On Wed, 6 Nov 2019 at 16:59, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 3:50 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 6 Nov 2019 at 18:42, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 6, 2019 at 2:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

Hi
I took all the attached patches (v32-01 to v32-4) and one of Dilip's
patches from the "Questions/Observations related to Gist vacuum" mail
thread. On top of all these patches, I created one more patch to test
parallel vacuum functionally across the existing test suite.

Thank you for looking at this patch!

For reference, I am attaching the patch.

What does this patch do?
As we know, we vacuum using parallel workers only if we give the
parallel option with vacuum. So to test, I used the existing GUC
force_parallel_mode and tested parallel vacuuming.

If force_parallel_mode is set to regress and the parallel option is
not given with vacuum, I am forcing the use of parallel workers for
vacuum. If there is only one index and the parallel degree is not
given with vacuum (or the parallel option is not given), and
force_parallel_mode = regress, then I am launching one parallel worker
(the leader does no work in this case), but if there is more than one
index, then I am using the leader as a worker for one index and
launching workers for all other indexes.

After applying this patch and setting force_parallel_mode = regress,
all test cases are passing (make check-world).

I have some questions regarding my patch. Should we do vacuuming
using parallel workers even if force_parallel_mode is set to on, or
should we use a new GUC to test parallel vacuum with the existing test
suite?

IMHO, with force_parallel_mode=on we don't need to do anything here,
because that is useful for normal query parallelism: if the user
thinks that a parallel plan should have been selected by the planner
but the planner did not select it, then the user can force it and
check. But vacuum parallelism is itself forced by the user, so there
is no point in doing it with force_parallel_mode=on.

Yeah, I think so too. force_parallel_mode is a planner parameter, and
parallel vacuum can be forced by the vacuum option.

However, force_parallel_mode=regress is useful for testing vacuum with
the existing test suite.

If we want to control the leader participation by a GUC parameter, I
think we would need another GUC parameter rather than using
force_parallel_mode.

I think the purpose is not to disable the leader participation;
instead, I think the purpose of 'force_parallel_mode=regress' is that
without changing the existing test suite we can execute the existing
vacuum commands in the test suite with the worker. I did not study the
patch but the idea should be that if "force_parallel_mode=regress"
then the normal vacuum command should be executed in parallel by using
1 worker.

And it would be useful if we could use the parameter for parallel
CREATE INDEX as well. But it should be a separate patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

Force_all_vacuum_to_use_parallel_vacuum_v2.patch (application/octet-stream)
commit ae501497308a3b245872fc667660a85befd7020d
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Date:   Fri Nov 8 15:07:22 2019 +0530

    Use parallel vacuum if force_parallel_mode is set to regress

    When force_parallel_mode is set to regress and the parallel option
    is not given with vacuum, then all the work will be done by a single
    worker (the leader will not do any work).  If the user gives the
    parallel option, then preference will be given to that parallel
    degree.

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a9d9f31..7f2914e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -67,6 +67,7 @@
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/optimizer.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
@@ -79,6 +80,7 @@
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
 
+extern	bool    do_vacuum_using_only_single_worker;
 
 /*
  * Space/time tradeoff parameters: do these need to be user-tunable?
@@ -2943,6 +2945,15 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
 	leaderparticipates = false;
 #endif
 
+	/*
+	 * If force_parallel_mode is set to regress, and the parallel option is
+	 * not given with the vacuum command, then all the vacuum work will be
+	 * done by a single worker and the leader will not participate.
+	 */
+	if (force_parallel_mode == FORCE_PARALLEL_REGRESS && nrequested == 1 &&
+		do_vacuum_using_only_single_worker)
+		leaderparticipates = false;
+
 	/* The leader process takes one index */
 	if (leaderparticipates)
 		nindexes_to_vacuum--;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 905d173..a4dc8ba 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -40,6 +40,7 @@
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
@@ -62,6 +63,7 @@ int			vacuum_freeze_min_age;
 int			vacuum_freeze_table_age;
 int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
+bool    do_vacuum_using_only_single_worker = false;
 
 
 /* A few variables that don't seem worth passing around as parameters */
@@ -170,6 +172,20 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(full ? VACOPT_FULL : 0) |
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
 
+	/*
+	 * If force_parallel_mode is set to regress and no parallel option is
+	 * specified with vacuum, then enable parallel vacuum and set the
+	 * degree of parallelism to 1.
+	 */
+	if (params.nworkers == -1 && !(params.options & VACOPT_FULL) &&
+		force_parallel_mode == FORCE_PARALLEL_REGRESS)
+	{
+		do_vacuum_using_only_single_worker = true;
+		params.nworkers = 1;
+	}
+	else
+		do_vacuum_using_only_single_worker = false;
+
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((params.options & VACOPT_VACUUM) ||
#175Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#173)

On Fri, 8 Nov 2019 at 18:48, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

+ /*
+ * Generally index cleanup does not scan the index when index
+ * vacuuming (ambulkdelete) was already performed.  So we perform
+ * index cleanup with parallel workers only if we have not
+ * performed index vacuuming yet.  Otherwise, we do it in the
+ * leader process alone.
+ */
+ if (vacrelstats->num_index_scans == 0)
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);

Today, I was thinking about this point where this check will work for
most cases, but still, exceptions are there like for brin index, the
main work is done in amvacuumcleanup function. Similarly, I think
there are few more indexes like gin, bloom where it seems we take
another pass over-index in the amvacuumcleanup phase. Don't you think
we should try to allow parallel workers for such cases? If so, I
don't have any great ideas on how to do that, but what comes to my
mind is to indicate that via stats (
IndexBulkDeleteResult) or via an indexam API. I am not sure if it is
acceptable to have indexam API for this.

Thoughts?

Good point. gin and bloom do certain heavy work during both cleanup
and bulkdelete, as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. So there are three types of
index AM just within the core postgres code. An idea I came up with is
that we can control parallel vacuum and parallel cleanup separately.
That is, adding a variable amcanparallelcleanup, and doing parallel
cleanup only on indexes for which amcanparallelcleanup is true. The
IndexBulkDeleteResult can be stored locally if both amcanparallelvacuum
and amcanparallelcleanup of an index are false, because only the
leader process deals with such indexes. Otherwise we need to store it
in DSM as always.
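
For illustration only (the am* fields here are from this proposal, not
a committed API), the leader-side decision could look like:

/*
 * Sketch: keep an index's bulk-delete stats in backend-local memory
 * when no worker will ever touch that index; otherwise place them in
 * the DSM segment so workers can update them.
 */
static bool
stats_live_in_dsm(bool amcanparallelvacuum, bool amcanparallelcleanup)
{
	/* only the leader handles this index, so local memory suffices */
	if (!amcanparallelvacuum && !amcanparallelcleanup)
		return false;

	return true;
}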

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#176Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#175)

On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 8 Nov 2019 at 18:48, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

+ /*
+ * Generally index cleanup does not scan the index when index
+ * vacuuming (ambulkdelete) was already performed.  So we perform
+ * index cleanup with parallel workers only if we have not
+ * performed index vacuuming yet.  Otherwise, we do it in the
+ * leader process alone.
+ */
+ if (vacrelstats->num_index_scans == 0)
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);

Today, I was thinking about this point where this check will work for
most cases, but still, exceptions are there like for brin index, the
main work is done in amvacuumcleanup function. Similarly, I think
there are few more indexes like gin, bloom where it seems we take
another pass over-index in the amvacuumcleanup phase. Don't you think
we should try to allow parallel workers for such cases? If so, I
don't have any great ideas on how to do that, but what comes to my
mind is to indicate that via stats (
IndexBulkDeleteResult) or via an indexam API. I am not sure if it is
acceptable to have indexam API for this.

Thoughts?

Good point. gin and bloom do a certain heavy work during cleanup and
during bulkdelete as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. There are three types of index
AM just inside postgres code. An idea I came up with is that we can
control parallel vacuum and parallel cleanup separately. That is,
adding a variable amcanparallelcleanup and we can do parallel cleanup
on only indexes of which amcanparallelcleanup is true. IndexBulkDelete
can be stored locally if both amcanparallelvacuum and
amcanparallelcleanup of an index are false because only the leader
process deals with such indexes. Otherwise we need to store it in DSM
as always.

IIUC, amcanparallelcleanup will be true for those indexes which do
heavy work during cleanup irrespective of whether bulkdelete is called
or not, e.g. gin? If so, along with an amcanparallelcleanup flag, we
need to consider vacrelstats->num_index_scans, right? So if
vacrelstats->num_index_scans == 0 then we need to launch parallel
workers for all the indexes that support amcanparallelvacuum, and if
vacrelstats->num_index_scans > 0 then only for those that have
amcanparallelcleanup set to true.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#177Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#176)

On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 8 Nov 2019 at 18:48, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

+ /*
+ * Generally index cleanup does not scan the index when index
+ * vacuuming (ambulkdelete) was already performed.  So we perform
+ * index cleanup with parallel workers only if we have not
+ * performed index vacuuming yet.  Otherwise, we do it in the
+ * leader process alone.
+ */
+ if (vacrelstats->num_index_scans == 0)
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);

Today, I was thinking about this point where this check will work for
most cases, but still, exceptions are there like for brin index, the
main work is done in amvacuumcleanup function. Similarly, I think
there are few more indexes like gin, bloom where it seems we take
another pass over-index in the amvacuumcleanup phase. Don't you think
we should try to allow parallel workers for such cases? If so, I
don't have any great ideas on how to do that, but what comes to my
mind is to indicate that via stats (
IndexBulkDeleteResult) or via an indexam API. I am not sure if it is
acceptable to have indexam API for this.

Thoughts?

Good point. gin and bloom do a certain heavy work during cleanup and
during bulkdelete as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. There are three types of index
AM just inside postgres code. An idea I came up with is that we can
control parallel vacuum and parallel cleanup separately. That is,
adding a variable amcanparallelcleanup and we can do parallel cleanup
on only indexes of which amcanparallelcleanup is true. IndexBulkDelete
can be stored locally if both amcanparallelvacuum and
amcanparallelcleanup of an index are false because only the leader
process deals with such indexes. Otherwise we need to store it in DSM
as always.

IIUC, amcanparallelcleanup will be true for those indexes which does
heavy work during cleanup irrespective of whether bulkdelete is called
or not e.g. gin?

Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
might set amcanparallelvacuum to true as well).

If so, along with an amcanparallelcleanup flag, we
need to consider vacrelstats->num_index_scans right? So if
vacrelstats->num_index_scans == 0 then we need to launch parallel
worker for all the indexes who support amcanparallelvacuum and if
vacrelstats->num_index_scans > 0 then only for those who has
amcanparallelcleanup as true.

Yes, you're right. But this won't work fine for brin indexes, which
don't want to participate in parallel vacuum but always want to
participate in parallel cleanup.

After more thought, I think we can have a ternary value: never,
always, once. If it's 'never', the index never participates in
parallel cleanup; I guess hash indexes use 'never'. Next, if it's
'always', the index always participates regardless of
vacrelstats->num_index_scans; I guess gin, brin and bloom use
'always'. Finally, if it's 'once', the index participates in parallel
cleanup only the first time (that is, vacrelstats->num_index_scans ==
0); I guess btree, gist and spgist use 'once'.
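
A minimal sketch of that ternary idea (all names invented here for
illustration; this is not a committed API):

/*
 * Sketch: how an index AM might declare its parallel-cleanup
 * behavior, and how the leader would test it.
 */
typedef enum ParallelCleanupMode
{
	PARALLEL_CLEANUP_NEVER,		/* e.g. hash */
	PARALLEL_CLEANUP_ALWAYS,	/* e.g. gin, brin, bloom */
	PARALLEL_CLEANUP_ONCE		/* e.g. btree, gist, spgist */
} ParallelCleanupMode;

static bool
do_parallel_cleanup(ParallelCleanupMode mode, int num_index_scans)
{
	if (mode == PARALLEL_CLEANUP_ALWAYS)
		return true;

	/* 'once': only when no index vacuum cycle has happened yet */
	return mode == PARALLEL_CLEANUP_ONCE && num_index_scans == 0;
}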

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#178Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#177)

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 8 Nov 2019 at 18:48, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

+ /*
+ * Generally index cleanup does not scan the index when index
+ * vacuuming (ambulkdelete) was already performed.  So we perform
+ * index cleanup with parallel workers only if we have not
+ * performed index vacuuming yet.  Otherwise, we do it in the
+ * leader process alone.
+ */
+ if (vacrelstats->num_index_scans == 0)
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);

Today, I was thinking about this point where this check will work for
most cases, but still, exceptions are there like for brin index, the
main work is done in amvacuumcleanup function. Similarly, I think
there are few more indexes like gin, bloom where it seems we take
another pass over-index in the amvacuumcleanup phase. Don't you think
we should try to allow parallel workers for such cases? If so, I
don't have any great ideas on how to do that, but what comes to my
mind is to indicate that via stats (
IndexBulkDeleteResult) or via an indexam API. I am not sure if it is
acceptable to have indexam API for this.

Thoughts?

Good point. gin and bloom do a certain heavy work during cleanup and
during bulkdelete as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. There are three types of index
AM just inside postgres code. An idea I came up with is that we can
control parallel vacuum and parallel cleanup separately. That is,
adding a variable amcanparallelcleanup and we can do parallel cleanup
on only indexes of which amcanparallelcleanup is true. IndexBulkDelete
can be stored locally if both amcanparallelvacuum and
amcanparallelcleanup of an index are false because only the leader
process deals with such indexes. Otherwise we need to store it in DSM
as always.

IIUC, amcanparallelcleanup will be true for those indexes which does
heavy work during cleanup irrespective of whether bulkdelete is called
or not e.g. gin?

Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
might set amcanparallevacuum to true as well).

If so, along with an amcanparallelcleanup flag, we
need to consider vacrelstats->num_index_scans right? So if
vacrelstats->num_index_scans == 0 then we need to launch parallel
worker for all the indexes who support amcanparallelvacuum and if
vacrelstats->num_index_scans > 0 then only for those who has
amcanparallelcleanup as true.

Yes, you're right. But this won't work fine for brin indexes who don't
want to participate in parallel vacuum but always want to participate
in parallel cleanup.

Yeah, right.

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

Yeah, this makes sense to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#179Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#149)

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

While reviewing the 0002 patch, I got one doubt related to how we are
dividing the maintenance_work_mem:

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
+		maintenance_work_mem;
+}

Is it fair to just consider the number of indexes which use
maintenance_work_mem? Or do we need to consider the number of workers
as well? My point is, suppose there are 10 indexes which will use
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this:
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?
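
To put concrete numbers on it (my own example, not from the patch):
with maintenance_work_mem = 1GB, nindexes_mwm = 10, and only 2 workers
launched, the current formula gives each worker about 100MB even
though at most 2 memory-hungry indexes are vacuumed concurrently,
whereas dividing by Min(10, 2) = 2 would give each worker 512MB.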

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#180Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#149)

Hi All,
I did some performance testing with the help of Dilip to test normal
vacuum and parallel vacuum. Below is the test summary:

Configuration settings:
autovacuum = off
shared_buffers = 2GB
max_parallel_maintenance_workers = 6

Test 1: (table has 4 indexes on all tuples; deleting alternate tuples)
create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index i1 on test (a);
create index i2 on test (b);
create index i3 on test (c);
create index i4 on test (d);
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a%2=0;

Case 1: (run normal vacuum)
vacuum test;
1019.453 ms

Case 2: (run vacuum with 1 parallel degree)
vacuum (parallel 1) test;
765.366 ms

Case 3: (run vacuum with 3 parallel degree)
vacuum (parallel 3) test;
555.227 ms

From the above results, we can conclude that with the help of parallel
vacuum, performance is improved for large indexes.

Test 2: (table has 16 indexes and the indexes are small; deleting
alternate tuples)

create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index i1 on test (a) where a < 100000;
create index i2 on test (a) where a > 100000 and a < 200000;
create index i3 on test (a) where a > 200000 and a < 300000;
create index i4 on test (a) where a > 300000 and a < 400000;
create index i5 on test (a) where a > 400000 and a < 500000;
create index i6 on test (a) where a > 500000 and a < 600000;
create index i7 on test (b) where a < 100000;
create index i8 on test (c) where a < 100000;
create index i9 on test (d) where a < 100000;
create index i10 on test (d) where a < 100000;
create index i11 on test (d) where a < 100000;
create index i12 on test (d) where a < 100000;
create index i13 on test (d) where a < 100000;
create index i14 on test (d) where a < 100000;
create index i15 on test (d) where a < 100000;
create index i16 on test (d) where a < 100000;
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a%2=0;

Case 1: (run normal vacuum)
vacuum test;
649.187 ms

Case 2: (run vacuum with 1 parallel degree)
vacuum (parallel 1) test;
492.075 ms

Case 3: (run vacuum with 3 parallel degree)
vacuum (parallel 3) test;
435.581 ms

For small indexes also, we gained some performance with parallel
vacuum.

I will continue my testing for stats collection.

Please let me know if anybody has any suggestions for other testing
(what should be tested).

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

On Tue, 29 Oct 2019 at 12:37, Masahiko Sawada <sawada.mshk@gmail.com> wrote:


On Mon, Oct 28, 2019 at 2:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am thinking if we can write the patch for both the approaches (a.
compute shared costs and try to delay based on that, b. try to divide
the I/O cost among workers as described in the email above[1]) and do
some tests to see the behavior of throttling; that might help us in
deciding what is the best strategy to solve this problem, if any.
What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested to quickly hack with
approach (a) then we can do some testing and compare.

Sawada-san, by any chance will you be interested to write a POC with
approach (a)? Otherwise, I will try to write it after finishing the
first one (approach b).

I have come up with the POC for approach (a).

Can we compute the overall throttling (sleep time) in the operation
separately for heap and index, then divide the index's sleep time by
the number of workers and add it to the heap's sleep time? Then, it
will be a bit easier to compare the data between the parallel and
non-parallel cases.

I have come up with a patch to compute the total delay during the
vacuum. So the idea of computing the total cost delay is

Total cost delay = Total delay of heap scan + Total delay of
index/worker. Patch is attached for the same.

I have prepared this patch on the latest patch of the parallel
vacuum[1]. I have also rebased the patch for the approach [b] for
dividing the vacuum cost limit and done some testing for computing the
I/O throttling. Attached patches 0001-POC-compute-total-cost-delay
and 0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't
rebased on top of v31-0006, because v31-0006 is implementing the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit is
doing the same with another approach. But,
0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
well (just 1-2 lines conflict).

Testing: I have performed 2 tests, one with same-size indexes and the
second with different-size indexes, and measured the total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1784 (ms)       1398 (ms)         1938 (ms)

Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 100000;
create index idx3 on test(c) where a > 150000;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,500000) as i;
delete from test where a < 200000;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1438 (ms)       1029 (ms)         1529 (ms)

Conclusion:
1. The tests show that the total I/O delay is significantly less with
parallel vacuum.
2. With the vacuum cost divide patch the problem is solved, but the
delay is a bit more compared to the non-parallel version. The reason
could be the problem discussed at [2], but it needs further
investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I
will also try to test different types of indexes.

Thank you for testing!

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

Regards,

--
Masahiko Sawada

#181Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#177)

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Good point. gin and bloom do a certain heavy work during cleanup and
during bulkdelete as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. There are three types of index
AM just inside postgres code. An idea I came up with is that we can
control parallel vacuum and parallel cleanup separately. That is,
adding a variable amcanparallelcleanup and we can do parallel cleanup
on only indexes of which amcanparallelcleanup is true.

This is what I mentioned in my email as a second option (whether to
expose via IndexAM). I am not sure if we can have a new variable just
for this.

IndexBulkDelete
can be stored locally if both amcanparallelvacuum and
amcanparallelcleanup of an index are false because only the leader
process deals with such indexes. Otherwise we need to store it in DSM
as always.

IIUC, amcanparallelcleanup will be true for those indexes which does
heavy work during cleanup irrespective of whether bulkdelete is called
or not e.g. gin?

Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
might set amcanparallevacuum to true as well).

If so, along with an amcanparallelcleanup flag, we
need to consider vacrelstats->num_index_scans right? So if
vacrelstats->num_index_scans == 0 then we need to launch parallel
worker for all the indexes who support amcanparallelvacuum and if
vacrelstats->num_index_scans > 0 then only for those who has
amcanparallelcleanup as true.

Yes, you're right. But this won't work fine for brin indexes who don't
want to participate in parallel vacuum but always want to participate
in parallel cleanup.

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

I think this 'once' option is confusing, especially because it also
depends on 'num_index_scans', which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism. Do we want to have separate variables for
ambulkdelete and amvacuumcleanup which decide whether the particular
phase can be done in parallel? Another possibility could be to just
have one variable (say uint16 amparallelvacuum) which will tell us all
the options, but I don't think that will be a popular approach
considering all the other methods and variables exposed. What do you
think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#182Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#180)

On Mon, Nov 11, 2019 at 2:53 PM Mahendra Singh <mahi6run@gmail.com> wrote:

For small indexes also, we gained some performance by parallel vacuum.

Thanks for doing all these tests. It is clear from this and previous
tests that this patch has benefits in a wide variety of cases.
However, we should try to see some worst cases as well. For example,
if there are multiple indexes on a table and only one of them is large
whereas all the others are very small, say having a few hundred or
thousand rows.

Note: Please don't use the top-posting style to reply. Here, we use
inline replies.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#183Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#181)

On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Good point. gin and bloom do a certain heavy work during cleanup and
during bulkdelete as you mentioned. Brin does it during cleanup, and
hash and gist do it during bulkdelete. There are three types of index
AM just inside postgres code. An idea I came up with is that we can
control parallel vacuum and parallel cleanup separately. That is,
adding a variable amcanparallelcleanup and we can do parallel cleanup
on only indexes of which amcanparallelcleanup is true.

This is what I mentioned in my email as a second option (whether to
expose via IndexAM). I am not sure if we can have a new variable just
for this.

IndexBulkDelete
can be stored locally if both amcanparallelvacuum and
amcanparallelcleanup of an index are false because only the leader
process deals with such indexes. Otherwise we need to store it in DSM
as always.

IIUC, amcanparallelcleanup will be true for those indexes which does
heavy work during cleanup irrespective of whether bulkdelete is called
or not e.g. gin?

Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
might set amcanparallevacuum to true as well).

If so, along with an amcanparallelcleanup flag, we
need to consider vacrelstats->num_index_scans right? So if
vacrelstats->num_index_scans == 0 then we need to launch parallel
worker for all the indexes who support amcanparallelvacuum and if
vacrelstats->num_index_scans > 0 then only for those who has
amcanparallelcleanup as true.

Yes, you're right. But this won't work fine for brin indexes who don't
want to participate in parallel vacuum but always want to participate
in parallel cleanup.

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

I think this 'once' option is confusing especially because it also
depends on 'num_index_scans' which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism.

I thought brin should set amcanparallelvacuum to false and
amcanparallelcleanup to 'always'.

Do we want to have separate variables for
ambulkdelete and amvacuumcleanup which decides whether the particular
phase can be done in parallel?

You mean adding variables to ambulkdelete and amvacuumcleanup as
function arguments? If so, isn't it too late to tell the leader
whether the particular phase can be done in parallel?

Another possibility could be to just
have one variable (say uint16 amparallelvacuum) which will tell us all
the options but I don't think that will be a popular approach
considering all the other methods and variables exposed. What do you
think?

Adding only one variable that can have flags would also be a good
idea, instead of having multiple variables for each option. For
instance, the FDW API uses such an interface (see eflags of
BeginForeignScan).

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#184Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#183)

On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

I think this 'once' option is confusing especially because it also
depends on 'num_index_scans' which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism.

I thought brin should set amcanparallelvacuum is false and
amcanparallelcleanup is 'always'.

In that case, it is better to name the variable as amcanparallelbulkdelete.

Do we want to have separate variables for
ambulkdelete and amvacuumcleanup which decides whether the particular
phase can be done in parallel?

You mean adding variables to ambulkdelete and amvacuumcleanup as
function arguments?

No, I mean separate amcanparallelbulkdelete (bool) and
amcanparallelvacuumcleanup (uint16) variables.

Another possibility could be to just
have one variable (say uint16 amparallelvacuum) which will tell us all
the options but I don't think that will be a popular approach
considering all the other methods and variables exposed. What do you
think?

Adding only one variable that can have flags would also be a good
idea, instead of having multiple variables for each option. For
instance FDW API uses such interface (see eflags of BeginForeignScan).

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)
VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)
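
For what it's worth, a minimal sketch of the cleanup-side test this
distinction implies (the helper is an invented name, and for
simplicity I treat the cleanup-related option as a single value;
nothing here is committed API):

#define VACUUM_OPTION_NO_PARALLEL           0
#define VACUUM_OPTION_NO_PARALLEL_CLEANUP   1
#define VACUUM_OPTION_PARALLEL_BULKDEL      2
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP 3
#define VACUUM_OPTION_PARALLEL_CLEANUP      4

/*
 * Sketch: can this index's vacuumcleanup run in a parallel worker?
 * 'bulkdel_done' is whether ambulkdelete already ran in this vacuum,
 * i.e. vacrelstats->num_index_scans > 0.
 */
static bool
can_parallel_cleanup(int option, bool bulkdel_done)
{
	if (option == VACUUM_OPTION_PARALLEL_CLEANUP)
		return true;				/* gin, brin, bloom */
	if (option == VACUUM_OPTION_PARALLEL_COND_CLEANUP)
		return !bulkdel_done;		/* nbtree, gist, spgist, ... */
	return false;					/* options 0 and 1 */
}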

Does something like this make sense? If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#185Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#184)

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

I think this 'once' option is confusing especially because it also
depends on 'num_index_scans' which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism.

I thought brin should set amcanparallelvacuum is false and
amcanparallelcleanup is 'always'.

In that case, it is better to name the variable as amcanparallelbulkdelete.

Do we want to have separate variables for
ambulkdelete and amvacuumcleanup which decides whether the particular
phase can be done in parallel?

You mean adding variables to ambulkdelete and amvacuumcleanup as
function arguments?

No, I mean separate variables amcanparallelbulkdelete (bool) and
amcanparallelvacuumcleanup (unit16) variables.

Another possibility could be to just
have one variable (say uint16 amparallelvacuum) which will tell us all
the options but I don't think that will be a popular approach
considering all the other methods and variables exposed. What do you
think?

Adding only one variable that can have flags would also be a good
idea, instead of having multiple variables for each option. For
instance FDW API uses such interface (see eflags of BeginForeignScan).

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option, because if 3 or 4 is not set then we
will not do the cleanup in parallel, right?

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

Yeah, something like that seems better to me.

If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.

Make sense.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#186Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#182)

On Mon, 11 Nov 2019 at 16:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 2:53 PM Mahendra Singh <mahi6run@gmail.com> wrote:

For small indexes also, we gained some performance by parallel vacuum.

Thanks for doing all these tests. It is clear with this and previous
tests that this patch has benefit in wide variety of cases. However,
we should try to see some worst cases as well. For example, if there
are multiple indexes on a table and only one of them is large whereas
all others are very small say having a few 100 or 1000 rows.

Thanks Amit for your comments.

I did some testing along the above suggested lines. Below is the
summary:
Test case: (I created 16 indexes but only 1 index is large; the others
are very small)
create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index i3 on test (a) where a > 2000 and a < 3000;
create index i4 on test (a) where a > 3000 and a < 4000;
create index i5 on test (a) where a > 4000 and a < 5000;
create index i6 on test (a) where a > 5000 and a < 6000;
create index i7 on test (b) where a < 1000;
create index i8 on test (c) where a < 1000;
create index i9 on test (d) where a < 1000;
create index i10 on test (d) where a < 1000;
create index i11 on test (d) where a < 1000;
create index i12 on test (d) where a < 1000;
create index i13 on test (d) where a < 1000;
create index i14 on test (d) where a < 1000;
create index i15 on test (d) where a < 1000;
create index i16 on test (d) where a < 1000;
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a%2=0;

case 1: vacuum without using parallel workers.
vacuum test;
228.259 ms

case 2: vacuum with 1 parallel worker.
vacuum (parallel 1) test;
251.725 ms

case 3: vacuum with 3 parallel workers.
vacuum (parallel 3) test;
259.986

From the above results, it seems that if indexes are small, then
parallel vacuum is not beneficial compared to normal vacuum.

Note: Please don't use the top-posting style to reply. Here, we use
inline reply.

Okay. I will use inline replies from now on.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#187Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#185)

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

After more thoughts, I think we can have a ternary value: never,
always, once. If it's 'never' the index never participates in parallel
cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
index always participates regardless of vacrelstats->num_index_scan. I
guess gin, brin and bloom use 'always'. Finally if it's 'once' the
index participates in parallel cleanup only when it's the first time
(that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
spgist use 'once'.

I think this 'once' option is confusing especially because it also
depends on 'num_index_scans' which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism.

I thought brin should set amcanparallelvacuum is false and
amcanparallelcleanup is 'always'.

In that case, it is better to name the variable as amcanparallelbulkdelete.

Do we want to have separate variables for
ambulkdelete and amvacuumcleanup which decides whether the particular
phase can be done in parallel?

You mean adding variables to ambulkdelete and amvacuumcleanup as
function arguments?

No, I mean separate variables amcanparallelbulkdelete (bool) and
amcanparallelvacuumcleanup (unit16) variables.

Another possibility could be to just
have one variable (say uint16 amparallelvacuum) which will tell us all
the options but I don't think that will be a popular approach
considering all the other methods and variables exposed. What do you
think?

Adding only one variable that can have flags would also be a good
idea, instead of having multiple variables for each option. For
instance FDW API uses such interface (see eflags of BeginForeignScan).

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? because if 3 or 4 is not set then we
will not do the cleanup in parallel right?

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about
having two flags instead: one for doing parallel cleanup only when
bulkdelete has not been performed yet
(VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another for always doing
parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)? That way, we can
have flags as follows, and an index AM chooses two flags: one from the
first two flags for bulk deletion and another from the next three
flags for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4
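
To illustrate the intended leader-side tests under this encoding (a
sketch only; the helper names are invented):

#define VACUUM_OPTION_PARALLEL_NO_BULKDEL   (1 << 0)
#define VACUUM_OPTION_PARALLEL_BULKDEL      (1 << 1)
#define VACUUM_OPTION_PARALLEL_NO_CLEANUP   (1 << 2)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP (1 << 3)
#define VACUUM_OPTION_PARALLEL_CLEANUP      (1 << 4)

/* bulk-deletion phase: worker-eligible iff BULKDEL is set */
static bool
parallel_bulkdel_ok(int opts)
{
	return (opts & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;
}

/*
 * cleanup phase: always for CLEANUP, only before any bulkdelete for
 * COND_CLEANUP (num_index_scans == 0), never for NO_CLEANUP.
 */
static bool
parallel_cleanup_ok(int opts, int num_index_scans)
{
	if (opts & VACUUM_OPTION_PARALLEL_CLEANUP)
		return true;
	if (opts & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
		return num_index_scans == 0;
	return false;
}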

Yeah, something like that seems better to me.

If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.

Make sense.

+1

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#188Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#179)

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintainance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maitenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or we need to consider the number of worker as
well.  My point is suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers then what is
the point in dividing the maintenance_work_mem by 10.

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand, I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when the parallel degree > the number of such indexes. Suppose
the table has 2 indexes and there are 10 workers; then we should
divide the maintenance_work_mem by 2 rather than 10, because at most 2
indexes that use the maintenance_work_mem are processed in parallel at
a time.

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#189Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#187)

On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? because if 3 or 4 is not set then we
will not do the cleanup in parallel right?

Yeah, but it is better to be explicit about this.

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about having
two flags instead: one for doing parallel cleanup when not performed
yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for doing
always parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?

Hmm, this is exactly what I intended to say with 3 and 4. I am not
sure what makes you think 4 is conditional.

That way, we
can have flags as follows and index AM chooses two flags, one from the
first two flags for bulk deletion and another from next three flags
for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4

This also looks reasonable, but if there is an index that doesn't want
to support a parallel vacuum, it needs to set multiple flags.

Yeah, something like that seems better to me.

If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.

Make sense.

+1

Okay, then I will write a separate email for this topic.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#190Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#188)

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that v31-0006 patch doesn't work fine so I've attached the
updated version patch that also incorporated some comments I got so
far. Sorry for the inconvenience. I'll apply your 0001 patch and also
test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintainance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maitenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or we need to consider the number of worker as
well.  My point is suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers then what is
the point in dividing the maintenance_work_mem by 10.

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the mantenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that uses the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#191Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#189)

On Tue, 12 Nov 2019 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? because if 3 or 4 is not set then we
will not do the cleanup in parallel right?

Yeah, but it is better to be explicit about this.

Isn't VACUUM_OPTION_NO_PARALLEL_BULKDEL missing? I think brin indexes
will use this flag. It will end up that
(VACUUM_OPTION_NO_PARALLEL_CLEANUP | VACUUM_OPTION_NO_PARALLEL_BULKDEL)
is equivalent to VACUUM_OPTION_NO_PARALLEL, though.

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about having
two flags instead: one for doing parallel cleanup when not performed
yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for doing
always parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?

Hmm, this is exactly what I intend to say with 3 and 4. I am not sure
what makes you think 4 is conditional.

Hmm, so why will gin and bloom set both flags 3 and 4? I thought that
if an index sets 4 it doesn't need to set 3, because 4 means always
doing cleanup in parallel.

That way, we
can have flags as follows and index AM chooses two flags, one from the
first two flags for bulk deletion and another from next three flags
for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4

This also looks reasonable, but if there is an index that doesn't want
to support a parallel vacuum, it needs to set multiple flags.

Right. It would be better to use a uint16 as two uint8 values. I mean
that if the first 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_BULKDEL
and if the next 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_CLEANUP.
The other flags could be the following:

VACUUM_OPTION_PARALLEL_BULKDEL 0x0001
VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
VACUUM_OPTION_PARALLEL_CLEANUP 0x0200
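
To make the two-byte idea concrete, here is a minimal sketch of that
encoding; the helper names are illustrative, not from any patch
version:

#include <stdint.h>
#include <stdbool.h>

/* low byte: bulk-deletion behaviour; high byte: cleanup behaviour */
#define VACUUM_OPTION_PARALLEL_BULKDEL      0x0001
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
#define VACUUM_OPTION_PARALLEL_CLEANUP      0x0200

/* an all-zero low byte means VACUUM_OPTION_PARALLEL_NO_BULKDEL */
static inline bool
supports_parallel_bulkdel(uint16_t options)
{
    return (options & 0x00FF) != 0;
}

/* an all-zero high byte means VACUUM_OPTION_PARALLEL_NO_CLEANUP */
static inline bool
supports_parallel_cleanup(uint16_t options)
{
    return (options & 0xFF00) != 0;
}

With this encoding an index AM that wants no parallel vacuum at all
can simply leave the variable at 0.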

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#192Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#190)

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.
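
For the archives, a minimal self-contained sketch of the agreed
formula (the Min macro mirrors the one in src/include/c.h; the worked
numbers are only illustrative): with 1GB of maintenance_work_mem, 10
indexes that use it, and 2 launched workers, each worker now gets
512MB instead of the roughly 102MB that dividing by nindexes_mwm alone
would allow.

/* Sketch only: divide maintenance_work_mem among the index vacuums
 * that can actually run concurrently; assumes nworkers >= 1. */
#define Min(x, y) ((x) < (y) ? (x) : (y))

static int
compute_mwm_per_worker(int maintenance_work_mem, /* in KB */
                       int nindexes_mwm,         /* indexes using it */
                       int nworkers)             /* launched workers */
{
    if (nindexes_mwm > 0)
        return maintenance_work_mem / Min(nindexes_mwm, nworkers);
    return maintenance_work_mem;
}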

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#193Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#191)

On Tue, Nov 12, 2019 at 5:30 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? Because if 3 or 4 is not set then we
will not do the cleanup in parallel, right?

Yeah, but it is better to be explicit about this.

VACUUM_OPTION_NO_PARALLEL_BULKDEL is missing?

I am not sure if that is required.

I think brin indexes
will use this flag.

Brin index can set VACUUM_OPTION_PARALLEL_CLEANUP in my proposal and
it should work.

It will end up with
(VACUUM_OPTION_NO_PARALLEL_CLEANUP |
VACUUM_OPTION_NO_PARALLEL_BULKDEL) being equivalent to
VACUUM_OPTION_NO_PARALLEL, though.

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about having
two flags instead: one for doing parallel cleanup when not performed
yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for always
doing parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?

Hmm, this is exactly what I intend to say with 3 and 4. I am not sure
what makes you think 4 is conditional.

Hmm, so why will gin and bloom set both flags 3 and 4? I thought if an
index sets 4 it doesn't need to set 3, because 4 means always doing
cleanup in parallel.

Yeah, that makes sense. They can just set 4.

That way, we
can have flags as follows and an index AM chooses two flags, one from
the first two flags for bulk deletion and another from the next three
flags for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4

This also looks reasonable, but if there is an index that doesn't want
to support a parallel vacuum, it needs to set multiple flags.

Right. It would be better to use a uint16 as two uint8 values. I mean
that if the first 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_BULKDEL
and if the next 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_CLEANUP.
The other flags could be the following:

VACUUM_OPTION_PARALLEL_BULKDEL 0x0001
VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
VACUUM_OPTION_PARALLEL_CLEANUP 0x0200

Hmm, I think we should define these flags in the simplest way.
Your previous proposal sounds okay to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#194Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#193)

On Tue, 12 Nov 2019 at 22:33, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:30 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? Because if 3 or 4 is not set then we
will not do the cleanup in parallel, right?

Yeah, but it is better to be explicit about this.

VACUUM_OPTION_NO_PARALLEL_BULKDEL is missing?

I am not sure if that is required.

I think brin indexes
will use this flag.

Brin index can set VACUUM_OPTION_PARALLEL_CLEANUP in my proposal and
it should work.

It will end up with
(VACUUM_OPTION_NO_PARALLEL_CLEANUP |
VACUUM_OPTION_NO_PARALLEL_BULKDEL) being equivalent to
VACUUM_OPTION_NO_PARALLEL, though.

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about having
two flags instead: one for doing parallel cleanup when not performed
yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for always
doing parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?

Hmm, this is exactly what I intend to say with 3 and 4. I am not sure
what makes you think 4 is conditional.

Hmm, so why will gin and bloom set both flags 3 and 4? I thought if an
index sets 4 it doesn't need to set 3, because 4 means always doing
cleanup in parallel.

Yeah, that makes sense. They can just set 4.

Okay,

That way, we
can have flags as follows and an index AM chooses two flags, one from
the first two flags for bulk deletion and another from the next three
flags for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4

This also looks reasonable, but if there is an index that doesn't want
to support a parallel vacuum, it needs to set multiple flags.

Right. It would be better to use a uint16 as two uint8 values. I mean
that if the first 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_BULKDEL
and if the next 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_CLEANUP.
The other flags could be the following:

VACUUM_OPTION_PARALLEL_BULKDEL 0x0001
VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
VACUUM_OPTION_PARALLEL_CLEANUP 0x0200

Hmm, I think we should define these flags in the simplest way.
Your previous proposal sounds okay to me.

Okay. As you mentioned before, my previous proposal won't work for
existing index AMs that don't set amparallelvacuumoptions. But since we
have amcanparallelvacuum, which is false by default, I think we don't
need to worry about a backward compatibility problem. Existing index
AMs will use neither parallel bulk deletion nor parallel cleanup by
default. When an index AM wants to support parallel vacuum it will set
amparallelvacuumoptions as well as amcanparallelvacuum.

I'll try to use my previous proposal and check it. If something goes
wrong we can go back to your proposal or another one.

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#195Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#194)

On Wed, Nov 13, 2019 at 6:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 22:33, Amit Kapila <amit.kapila16@gmail.com> wrote:

Hmm, I think we should define these flags in the simplest way.
Your previous proposal sounds okay to me.

Okay. As you mentioned before, my previous proposal won't work for
existing index AMs that don't set amparallelvacuumoptions.

You mean to say it won't work because an index AM has to set multiple
flags, which means that if the IndexAM author doesn't set the value of
amparallelvacuumoptions then it won't work?

But since we
have amcanparallelvacuum which is false by default I think we don't
need to worry about backward compatibility problem. The existing index
AM will use neither parallel bulk-deletion nor parallel cleanup by
default. When it wants to support parallel vacuum they will set
amparallelvacuumoptions as well as amcanparallelvacuum.

Hmm, I was not thinking of multiple variables, rather only one
variable. The default value should indicate that the IndexAM doesn't
support a parallel vacuum. It might be that we need to do it the way
I originally proposed, with the different values of
amparallelvacuumoptions, or maybe some variant of it where the default
value can clearly say that the IndexAM doesn't support a parallel
vacuum.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#196Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#193)

On Tue, Nov 12, 2019 at 7:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:30 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)

Maybe we don't want this option? Because if 3 or 4 is not set then we
will not do the cleanup in parallel, right?

Yeah, but it is better to be explicit about this.

VACUUM_OPTION_NO_PARALLEL_BULKDEL is missing?

I am not sure if that is required.

I think brin indexes
will use this flag.

Brin index can set VACUUM_OPTION_PARALLEL_CLEANUP in my proposal and
it should work.

IIUC, VACUUM_OPTION_PARALLEL_CLEANUP means no parallel bulk delete and
always parallel cleanup? I am not sure whether this is the best way,
because for the cleanup option we are being explicit about each case,
i.e. PARALLEL_CLEANUP, NO_PARALLEL_CLEANUP, etc., so why not the same
for the bulk delete? I mean, why don't we keep both PARALLEL_BULKDEL
and NO_PARALLEL_BULKDEL?

It will end up with
(VACUUM_OPTION_NO_PARALLEL_CLEANUP |
VACUUM_OPTION_NO_PARALLEL_BULKDEL) being equivalent to
VACUUM_OPTION_NO_PARALLEL, though.

VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?

3 and 4 confused me because 4 also looks conditional. How about having
two flags instead: one for doing parallel cleanup when not performed
yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for always
doing parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?

Hmm, this is exactly what I intend to say with 3 and 4. I am not sure
what makes you think 4 is conditional.

Hmm, so why will gin and bloom set both flags 3 and 4? I thought if an
index sets 4 it doesn't need to set 3, because 4 means always doing
cleanup in parallel.

Yeah, that makes sense. They can just set 4.

That way, we
can have flags as follows and an index AM chooses two flags, one from
the first two flags for bulk deletion and another from the next three
flags for cleanup.

VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4

This also looks reasonable, but if there is an index that doesn't want
to support a parallel vacuum, it needs to set multiple flags.

Right. It would be better to use a uint16 as two uint8 values. I mean
that if the first 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_BULKDEL
and if the next 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_CLEANUP.
The other flags could be the following:

VACUUM_OPTION_PARALLEL_BULKDEL 0x0001
VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
VACUUM_OPTION_PARALLEL_CLEANUP 0x0200

Hmm, I think we should define these flags in the simplest way.
Your previous proposal sounds okay to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#197Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#195)

On Wed, 13 Nov 2019 at 11:38, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 13, 2019 at 6:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 22:33, Amit Kapila <amit.kapila16@gmail.com> wrote:

Hmm, I think we should define these flags in the simplest way.
Your previous proposal sounds okay to me.

Okay. As you mentioned before, my previous proposal won't work for
existing index AMs that don't set amparallelvacuumoptions.

You mean to say it won't work because an index AM has to set multiple
flags, which means that if the IndexAM author doesn't set the value of
amparallelvacuumoptions then it won't work?

Yes. In my previous proposal every index AM needs to set two flags.

But since we
have amcanparallelvacuum which is false by default I think we don't
need to worry about backward compatibility problem. The existing index
AM will use neither parallel bulk-deletion nor parallel cleanup by
default. When it wants to support parallel vacuum they will set
amparallelvacuumoptions as well as amcanparallelvacuum.

Hmm, I was not thinking of multiple variables, rather only one
variable. The default value should indicate that the IndexAM doesn't
support a parallel vacuum.

Yes.

It might be that we need to do it the way
I originally proposed, with the different values of
amparallelvacuumoptions, or maybe some variant of it where the default
value can clearly say that the IndexAM doesn't support a parallel
vacuum.

Okay. After more thought on your original proposal, what confuses me
is that there are two types of flags, ones that enable options and one
that disables them. Looking at 2, 3 and 4, it looks like all options
are disabled by default and setting these flags enables them. On the
other hand, looking at 1, it looks like these options are enabled by
default and setting the flag disables them. 0 makes sense to me. So
how about having 0, 2, 3 and 4?

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#198Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#192)

On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.

One more comment.

+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps)
+{
+ ....

+ if (ParallelVacuumIsActive(lps))
+ {

+
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);
+
+ }
+
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ /*
+ * Skip indexes that we have already vacuumed during parallel index
+ * vacuuming.
+ */
+ if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+ continue;
+
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ }
+}

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#199Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#197)

On Wed, Nov 13, 2019 at 8:34 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 11:38, Amit Kapila <amit.kapila16@gmail.com> wrote:

It might be that we need to do it the way
I originally proposed the different values of amparallelvacuumoptions
or maybe some variant of it where the default value can clearly say
that IndexAm doesn't support a parallel vacuum.

Okay. After more thought on your original proposal, what confuses me
is that there are two types of flags, ones that enable options and one
that disables them. Looking at 2, 3 and 4, it looks like all options
are disabled by default and setting these flags enables them. On the
other hand, looking at 1, it looks like these options are enabled by
default and setting the flag disables them. 0 makes sense to me. So
how about having 0, 2, 3 and 4?

Yeah, 0, 2, 3 and 4 sound reasonable to me. Earlier, Dilip also got
confused by option 1.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#200Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#198)

On Wed, Nov 13, 2019 at 9:12 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.

One more comment.

+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps)
+{
+ ....

+ if (ParallelVacuumIsActive(lps))
+ {

+
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);
+
+ }
+
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ /*
+ * Skip indexes that we have already vacuumed during parallel index
+ * vacuuming.
+ */
+ if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+ continue;
+
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ }
+}

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

+ /*
+ * Since parallel workers cannot access data in temporary tables, parallel
+ * vacuum is not allowed for temporary relation.
+ */
+ if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+ {
+ ereport(WARNING,
+ (errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary
tables in parallel",
+ RelationGetRelationName(onerel))));
+ relation_close(onerel, lmode);
+ PopActiveSnapshot();
+ CommitTransactionCommand();
+ /* It's OK to proceed with ANALYZE on this table */
+ return true;
+ }
+

If we cannot support parallel vacuum for a temporary table then
shouldn't we fall back to the normal vacuum instead of skipping the
table? I think it's not fair that if the user has requested a
system-wide parallel vacuum then all the temp tables will be skipped
and not vacuumed at all, and the user then needs to perform a normal
vacuum on those tables again.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#201Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#199)

On Wed, Nov 13, 2019 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, 0, 2, 3 and 4 sound reasonable to me. Earlier, Dilip also got
confused by option 1.

Let me try to summarize the discussion on this point and see if others
have any opinion on this matter.

We need a way to allow IndexAm to specify whether it can participate
in a parallel vacuum. As we know there are two phases of
index-vacuum, bulkdelete and vacuumcleanup and in many cases, the
bulkdelete performs the main deletion work and then vacuumcleanup just
returns index statistics. So, for such cases, we don't want the second
phase to be performed by a parallel vacuum worker. Now, if the
bulkdelete phase is not performed, then vacuumcleanup can process the
entire index in which case it is better to do that phase via parallel
worker.

OTOH, in some cases vacuumcleanup takes another pass over the index to
reclaim empty pages and record the same in the FSM even if
bulkdelete is performed. This happens in gin and bloom indexes.
Then, we have an index where we do all the work in cleanup phase like
in the case of brin indexes. Now, for this category of indexes, we
want vacuumcleanup phase to be also performed by a parallel worker.

In short different indexes have different requirements for which phase
of index vacuum can be performed in parallel. Just to be clear, we
can't perform both the phases (bulkdelete and cleanup) in one-go as
bulk-delete can happen multiple times on a large index whereas
vacuumcleanup is done once at the end.

Based on these needs, we came up with a way to allow users to specify
this information for IndexAMs. Basically, an index AM will expose a
variable amparallelvacuumoptions which can have the below options:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

We have discussed exposing this information via two variables, but the
above seems like a better idea to all the people involved.
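
As a concrete illustration of the proposal, here is a minimal sketch
in C; the flag values are from the list above, while the nbtree
example and the helper function are only illustrative, not taken from
the patch:

/* Proposed amparallelvacuumoptions bits */
#define VACUUM_OPTION_NO_PARALLEL           (1 << 0)
#define VACUUM_OPTION_PARALLEL_BULKDEL      (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP      (1 << 3)

/* e.g. nbtree: parallel bulkdelete, and parallel cleanup only when no
 * bulkdelete has been performed in this vacuum */
static const int nbtree_parallel_vacuum_options =
    VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;

/* How the leader could decide whether an index's vacuumcleanup may be
 * handed to a parallel worker. */
static int
cleanup_can_run_in_worker(int options, int bulkdel_done)
{
    if (options & VACUUM_OPTION_PARALLEL_CLEANUP)
        return 1;                 /* always parallel */
    if (options & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
        return !bulkdel_done;     /* only if bulkdelete didn't run */
    return 0;
}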

Any suggestions? Does anyone think this is not the right way to expose
this information, that there is no need to expose it, or does anyone
have a better idea for this?

Sawada-San, Dilip, feel free to correct me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#202Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#198)

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.

One more comment.

+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps)
+{
+ ....

+ if (ParallelVacuumIsActive(lps))
+ {

+
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);
+
+ }
+
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ /*
+ * Skip indexes that we have already vacuumed during parallel index
+ * vacuuming.
+ */
+ if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+ continue;
+
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ }
+}

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem? If it's not a problem,
I think we can tie indexes that don't support parallel vacuum to
the leader and do parallel index vacuum.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#203Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#186)

On Tue, Nov 12, 2019 at 3:14 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Mon, 11 Nov 2019 at 16:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 11, 2019 at 2:53 PM Mahendra Singh <mahi6run@gmail.com> wrote:

For small indexes also, we gained some performance by parallel vacuum.

Thanks for doing all these tests. It is clear from this and previous
tests that this patch has benefits in a wide variety of cases. However,
we should try to see some worst cases as well. For example, if there
are multiple indexes on a table and only one of them is large whereas
all the others are very small, say having a few hundred or thousand rows.

Thanks Amit for your comments.

I did some testing along the above suggested lines. Below is the summary:
Test case: (I created 16 indexes but only 1 index is large; the others are very small)
create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index i3 on test (a) where a > 2000 and a < 3000;
create index i4 on test (a) where a > 3000 and a < 4000;
create index i5 on test (a) where a > 4000 and a < 5000;
create index i6 on test (a) where a > 5000 and a < 6000;
create index i7 on test (b) where a < 1000;
create index i8 on test (c) where a < 1000;
create index i9 on test (d) where a < 1000;
create index i10 on test (d) where a < 1000;
create index i11 on test (d) where a < 1000;
create index i12 on test (d) where a < 1000;
create index i13 on test (d) where a < 1000;
create index i14 on test (d) where a < 1000;
create index i15 on test (d) where a < 1000;
create index i16 on test (d) where a < 1000;
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a %2=0;

case 1: vacuum without using parallel workers.
vacuum test;
228.259 ms

case 2: vacuum with 1 parallel worker.
vacuum (parallel 1) test;
251.725 ms

case 3: vacuum with 3 parallel workers.
vacuum (parallel 3) test;
259.986 ms

From the above results, it seems that if indexes are small then parallel vacuum is not beneficial compared to normal vacuum.

Right, and that is what is expected as well. However, I think it would
be better if we somehow disallow very small indexes from using a
parallel worker. Can we use min_parallel_index_scan_size to decide
whether a particular index can participate in a parallel vacuum?
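
A minimal sketch of such a check follows; the function name is
hypothetical, while min_parallel_index_scan_size (measured in blocks)
and RelationGetNumberOfBlocks() are the existing GUC and helper:

#include "postgres.h"
#include "optimizer/paths.h"    /* min_parallel_index_scan_size */
#include "storage/bufmgr.h"     /* RelationGetNumberOfBlocks() */
#include "utils/rel.h"

/* Hypothetical: let only indexes of a useful size participate in the
 * parallel vacuum; smaller ones would stay with the leader. */
static bool
index_participates_in_parallel_vacuum(Relation indrel)
{
    return RelationGetNumberOfBlocks(indrel) >=
        (BlockNumber) min_parallel_index_scan_size;
}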

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#204Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#201)

On Wed, Nov 13, 2019 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 13, 2019 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, 0, 2, 3 and 4 sound reasonable to me. Earlier, Dilip also got
confused by option 1.

Let me try to summarize the discussion on this point and see if others
have any opinion on this matter.

We need a way to allow IndexAm to specify whether it can participate
in a parallel vacuum. As we know there are two phases of
index-vacuum, bulkdelete and vacuumcleanup and in many cases, the
bulkdelete performs the main deletion work and then vacuumcleanup just
returns index statistics. So, for such cases, we don't want the second
phase to be performed by a parallel vacuum worker. Now, if the
bulkdelete phase is not performed, then vacuumcleanup can process the
entire index in which case it is better to do that phase via parallel
worker.

OTOH, in some cases vacuumcleanup takes another pass over-index to
reclaim empty pages and update record the same in FSM even if
bulkdelete is performed. This happens in gin and bloom indexes.
Then, we have an index where we do all the work in cleanup phase like
in the case of brin indexes. Now, for this category of indexes, we
want vacuumcleanup phase to be also performed by a parallel worker.

In short different indexes have different requirements for which phase
of index vacuum can be performed in parallel. Just to be clear, we
can't perform both the phases (bulkdelete and cleanup) in one-go as
bulk-delete can happen multiple times on a large index whereas
vacuumcleanup is done once at the end.

Based on these needs, we came up with a way to allow users to specify
this information for IndexAMs. Basically, an index AM will expose a
variable amparallelvacuumoptions which can have the below options:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

We have discussed exposing this information via two variables, but the
above seems like a better idea to all the people involved.

Any suggestions? Does anyone think this is not the right way to expose
this information, that there is no need to expose it, or does anyone
have a better idea for this?

Sawada-San, Dilip, feel free to correct me.

Looks fine to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#205Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#202)

On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem?

Can you please explain what problem you see with that?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#206Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#202)

On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.

One more comment.

+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps)
+{
+ ....

+ if (ParallelVacuumIsActive(lps))
+ {

+
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);
+
+ }
+
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ /*
+ * Skip indexes that we have already vacuumed during parallel index
+ * vacuuming.
+ */
+ if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+ continue;
+
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ }
+}

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem?

I am not sure what could be the problem.

If it's not a problem,

I think we can tie indexes that don't support parallel vacuum to
the leader and do parallel index vacuum.

I am not sure whether we can do that or not, because if we do a
parallel vacuum from the leader for the indexes which don't support
the parallel option then we will unnecessarily allocate shared memory
for those indexes (index stats). Moreover, I think it could also
cause a problem in a multi-pass vacuum if we try to copy their stats
into the shared memory.

I think a simpler option would be that as soon as leader participation
is over we run a loop over all the indexes that don't support
parallelism in that phase, and after completing that we wait for the
parallel workers to finish.
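
For clarity, a rough sketch of that flow (types and most calls follow
the patch excerpt quoted earlier; the assumption that the parallel
phase can return before the workers finish, and the wait call itself,
are hypothetical):

/* Sketch only, not the patch's actual code. */
static void
lazy_vacuum_indexes_sketch(LVRelStats *vacrelstats, Relation *Irel,
                           int nindexes, IndexBulkDeleteResult **stats,
                           LVParallelState *lps)
{
    int idx;

    /* leader does its share of the parallel index vacuums, assumed
     * here not to wait for the workers internally */
    lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
                                            stats, lps);

    /* process the sequential-only indexes while workers still run */
    for (idx = 0; idx < nindexes; idx++)
    {
        if (!IndStatsIsNull(lps->lvshared, idx))
            continue;           /* already handled in parallel */
        lazy_vacuum_index(Irel[idx], &stats[idx],
                          vacrelstats->dead_tuples,
                          vacrelstats->old_live_tuples);
    }

    /* only now block until every worker has finished */
    lazy_wait_for_workers(lps); /* hypothetical wait primitive */
}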

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#207Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#205)

On Wed, 13 Nov 2019 at 17:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem?

Can you please explain what problem you see with that?

I think it depends on the index AM user's expectation. If disabling
parallel vacuum for an index just means that the index AM user doesn't
want the index to be vacuumed by a parallel worker, it's not a problem.
But if it means that the user doesn't want the index to be vacuumed
while other indexes are being processed in parallel, it's unexpected
behaviour for the user. I'm probably worrying too much.

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#208Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#207)

On Wed, Nov 13, 2019 at 3:55 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 17:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem?

Can you please explain what problem you see with that?

I think it depends on the index AM user's expectation. If disabling
parallel vacuum for an index just means that the index AM user doesn't
want the index to be vacuumed by a parallel worker, it's not a problem.
But if it means that the user doesn't want the index to be vacuumed
while other indexes are being processed in parallel, it's unexpected
behaviour for the user.

I would expect the former.

I'm probably worrying too much.

Yeah, we can keep the behavior with respect to your first expectation
(if disabling parallel vacuum for an index just means that the index AM
user doesn't want the index to be vacuumed by a parallel worker, it's
not a problem). It might not be difficult to change later if there is
an example of such a case.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#209Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#206)

On Wed, 13 Nov 2019 at 18:49, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I realized that the v31-0006 patch doesn't work correctly so I've
attached the updated version patch that also incorporates some comments
I got so far. Sorry for the inconvenience. I'll apply your 0001 patch
and also test the total delay time.

While reviewing the 0002, I got one doubt related to how we are
dividing the maintenance_work_mem

+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ /* Compute the new maintenance_work_mem value for index vacuuming */
+ lvshared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
maintenance_work_mem;
+}
Is it fair to just consider the number of indexes which use
maintenance_work_mem?  Or do we need to consider the number of workers as
well?  My point is, suppose there are 10 indexes which will use the
maintenance_work_mem but we are launching just 2 workers; then what is
the point in dividing the maintenance_work_mem by 10?

IMHO the calculation should be like this
lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
maintenance_work_mem / Min(nindexes_mwm, nworkers) :
maintenance_work_mem;

Am I missing something?

No, I think you're right. On the other hand I think that dividing it
by the number of indexes that will use the maintenance_work_mem makes
sense when parallel degree > the number of such indexes. Suppose the
table has 2 indexes and there are 10 workers then we should divide the
maintenance_work_mem by 2 rather than 10 because it's possible that at
most 2 indexes that use the maintenance_work_mem are processed in
parallel at a time.

Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).

Thanks! I'll fix it in the next version patch.

One more comment.

+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+ int nindexes, IndexBulkDeleteResult **stats,
+ LVParallelState *lps)
+{
+ ....

+ if (ParallelVacuumIsActive(lps))
+ {

+
+ lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ stats, lps);
+
+ }
+
+ for (idx = 0; idx < nindexes; idx++)
+ {
+ /*
+ * Skip indexes that we have already vacuumed during parallel index
+ * vacuuming.
+ */
+ if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+ continue;
+
+ lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+   vacrelstats->old_live_tuples);
+ }
+}

In this function, if ParallelVacuumIsActive, we perform the parallel
vacuum for all the indexes for which parallel vacuum is supported and
once that is over we finish vacuuming the remaining indexes for which
parallel vacuum is not supported. But my question is that inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start with the sequential vacuuming;
shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?

If we do that, while the leader process is vacuuming indexes that
don't support parallel vacuum sequentially, some workers might be
vacuuming other indexes. Isn't that a problem?

I am not sure what could be the problem.

If it's not a problem,

I think we can tie indexes that don't support parallel vacuum to
the leader and do parallel index vacuum.

I am not sure whether we can do that or not, because if we do a
parallel vacuum from the leader for the indexes which don't support
the parallel option then we will unnecessarily allocate shared memory
for those indexes (index stats). Moreover, I think it could also
cause a problem in a multi-pass vacuum if we try to copy their stats
into the shared memory.

I think a simpler option would be that as soon as leader participation
is over we run a loop over all the indexes that don't support
parallelism in that phase, and after completing that we wait for the
parallel workers to finish.

Hmm, I thought we don't allocate DSM for indexes which support neither
parallel bulk deletion nor parallel cleanup, and we can always assign
indexes to the leader process if they don't support a particular phase
during parallel index vacuuming. But your suggestion sounds simpler.
I'll incorporate your suggestion in the next version of the patch.
Thanks!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#210Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#200)

On Wed, Nov 13, 2019 at 9:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

+ /*
+ * Since parallel workers cannot access data in temporary tables, parallel
+ * vacuum is not allowed for temporary relation.
+ */
+ if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+ {
+ ereport(WARNING,
+ (errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary
tables in parallel",
+ RelationGetRelationName(onerel))));
+ relation_close(onerel, lmode);
+ PopActiveSnapshot();
+ CommitTransactionCommand();
+ /* It's OK to proceed with ANALYZE on this table */
+ return true;
+ }
+

If we cannot support parallel vacuum for a temporary table then
shouldn't we fall back to the normal vacuum instead of skipping the
table? I think it's not fair that if the user has requested a
system-wide parallel vacuum then all the temp tables will be skipped
and not vacuumed at all, and the user then needs to perform a normal
vacuum on those tables again.

Good point. However, I think the current coding also makes sense for
cases like "Vacuum (analyze, parallel 2) tmp_tab;". In such a case,
it will skip the vacuum part of it but will perform analyze. Having
said that, I can see the merit of your point and I also vote to follow
your suggestion and add a note to the documentation, unless it makes
the code look ugly.
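
A minimal sketch of that fallback (variable names follow the patch
excerpt above; treating nworkers = -1 as "parallel disabled" is an
assumed convention, not something the patch confirms):

/* Sketch only: fall back to a serial vacuum instead of skipping the
 * temporary table, since parallel workers cannot access local
 * buffers. */
if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
{
    ereport(WARNING,
            (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
                    RelationGetRelationName(onerel))));
    params->nworkers = -1;      /* proceed with a serial vacuum */
}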

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#211Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#201)

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 13, 2019 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Yeah, 0, 2, 3 and 4 sound reasonable to me. Earlier, Dilip also got
confused by option 1.

Let me try to summarize the discussion on this point and see if others
have any opinion on this matter.

Thank you for summarizing.

We need a way to allow IndexAm to specify whether it can participate
in a parallel vacuum. As we know there are two phases of
index-vacuum, bulkdelete and vacuumcleanup and in many cases, the
bulkdelete performs the main deletion work and then vacuumcleanup just
returns index statistics. So, for such cases, we don't want the second
phase to be performed by a parallel vacuum worker. Now, if the
bulkdelete phase is not performed, then vacuumcleanup can process the
entire index in which case it is better to do that phase via parallel
worker.

OTOH, in some cases vacuumcleanup takes another pass over the index to
reclaim empty pages and record the same in the FSM even if
bulkdelete is performed. This happens in gin and bloom indexes.
Then, we have an index where we do all the work in cleanup phase like
in the case of brin indexes. Now, for this category of indexes, we
want vacuumcleanup phase to be also performed by a parallel worker.

In short different indexes have different requirements for which phase
of index vacuum can be performed in parallel. Just to be clear, we
can't perform both the phases (bulkdelete and cleanup) in one-go as
bulk-delete can happen multiple times on a large index whereas
vacuumcleanup is done once at the end.

Based on these needs, we came up with a way to allow users to specify
this information for IndexAMs. Basically, an index AM will expose a
variable amparallelvacuumoptions which can have the below options:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?
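
For illustration, such a check could look like the minimal sketch
below (not the actual patch code; the helper name and its placement
are assumptions):

#include "postgres.h"

#include "access/amapi.h"
#include "commands/vacuum.h"
#include "utils/rel.h"

/*
 * Illustrative sketch only: assert that an index AM does not set both
 * VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP.
 */
static void
check_parallel_vacuum_options(Relation indexRelation)
{
	uint8		vacoptions = indexRelation->rd_indam->amparallelvacuumoptions;

	Assert((vacoptions & (VACUUM_OPTION_PARALLEL_COND_CLEANUP |
						  VACUUM_OPTION_PARALLEL_CLEANUP)) !=
		   (VACUUM_OPTION_PARALLEL_COND_CLEANUP |
			VACUUM_OPTION_PARALLEL_CLEANUP));
}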

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#212Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#211)

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow index AMs to
specify this information. Basically, the index AM will expose a
variable amparallelvacuumoptions which can have the options below:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]/messages/by-id/CAA4eK1+uDgLwfnAhQWGpAe66D85PdkeBygZGVyX96+ovN1PbOg@mail.gmail.com? I think you can do
that if you agree with the conclusion in the last email[1]/messages/by-id/CAA4eK1+uDgLwfnAhQWGpAe66D85PdkeBygZGVyX96+ovN1PbOg@mail.gmail.com; otherwise,
we can explore it separately.

[1]: /messages/by-id/CAA4eK1+uDgLwfnAhQWGpAe66D85PdkeBygZGVyX96+ovN1PbOg@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#213Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#212)

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow index AMs to
specify this information. Basically, the index AM will expose a
variable amparallelvacuumoptions which can have the options below:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1]; otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to discuss this further
based on the patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#214Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Masahiko Sawada (#213)
3 attachment(s)

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow index AMs to
specify this information. Basically, the index AM will expose a
variable amparallelvacuumoptions which can have the options below:

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1]; otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to discuss this further
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as the shared cost
balance. Also, I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that is not efficient in some cases. For
example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch
5 workers for each execution, but some workers would do nothing at
all. To deal with this problem, I wonder if we can improve the
parallel query infrastructure so that the leader process creates a
parallel context sized for the maximum number of workers and can then
launch only a subset of them; see the sketch below.
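
For reference, a rough sketch of that worker-count computation (the
function name mirrors compute_parallel_workers from the attached
patch, but the body here is purely illustrative):

#include "postgres.h"

#include "access/amapi.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
#include "utils/rel.h"

/*
 * Illustrative sketch: count how many indexes support each parallel
 * phase and size the worker pool by the larger of the two counts,
 * clamped by the requested degree and max_parallel_maintenance_workers.
 */
static int
compute_parallel_workers_sketch(Relation *Irel, int nindexes, int nrequested)
{
	int			nbulkdel = 0;
	int			ncleanup = 0;
	int			nworkers;
	int			i;

	for (i = 0; i < nindexes; i++)
	{
		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
			nbulkdel++;
		if ((vacoptions & (VACUUM_OPTION_PARALLEL_COND_CLEANUP |
						   VACUUM_OPTION_PARALLEL_CLEANUP)) != 0)
			ncleanup++;
	}

	/* Launch enough workers for the busier of the two phases. */
	nworkers = Max(nbulkdel, ncleanup);

	/* Respect an explicit PARALLEL degree, if one was given. */
	if (nrequested > 0)
		nworkers = Min(nworkers, nrequested);

	return Min(nworkers, max_parallel_maintenance_workers);
}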

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v33-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch (application/octet-stream)
From 1283b804697902b70e9cd234e36b853b126d6efe Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v33 1/3] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  5 ++++
 doc/src/sgml/indexam.sgml                     | 21 ++++++++++++++
 src/backend/access/brin/brin.c                |  5 ++++
 src/backend/access/gin/ginutil.c              |  5 ++++
 src/backend/access/gist/gist.c                |  5 ++++
 src/backend/access/hash/hash.c                |  4 +++
 src/backend/access/index/indexam.c            | 29 +++++++++++++++++++
 src/backend/access/nbtree/nbtree.c            |  4 +++
 src/backend/access/spgist/spgutils.c          |  5 ++++
 src/include/access/amapi.h                    | 13 +++++++++
 src/include/access/genam.h                    |  1 +
 src/include/commands/vacuum.h                 | 28 ++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  4 +++
 13 files changed, 129 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..cde36c5b49 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
@@ -144,6 +148,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..693171dc4f 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -149,6 +153,9 @@ typedef struct IndexAmRoutine
     amestimateparallelscan_function amestimateparallelscan;    /* can be NULL */
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
+
+    /* interface functions to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
   </para>
@@ -731,6 +738,20 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory which the
+   access method will need to store its statistics.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than the size of <structname>IndexBulkDeleteResult</structname>.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..fbb4af9df1 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
@@ -124,6 +128,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..8c174b28fc 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
@@ -76,6 +80,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8d9c8d025d..bbb630fb88 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -97,6 +101,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..10d6efdd9f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
@@ -95,6 +98,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..5238b9d38f 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -711,6 +711,35 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 	return indexRelation->rd_indam->amvacuumcleanup(info, stats);
 }
 
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+	Size		nbytes;
+
+	RELATION_CHECKS;
+
+	/*
+	 * If amestimateparallelvacuum is not provided, assume only
+	 * IndexBulkDeleteResult is needed.
+	 */
+	if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+	{
+		nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+		Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+	}
+	else
+		nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+	return nbytes;
+}
+
 /* ----------------
  *		index_can_return
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad..6a8d12ecbf 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -123,6 +123,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
@@ -146,6 +149,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = btestimateparallelscan;
 	amroutine->aminitparallelscan = btinitparallelscan;
 	amroutine->amparallelrescan = btparallelrescan;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db147..bb3e855cce 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
@@ -79,6 +83,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..0fd399442d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -156,6 +156,12 @@ typedef void (*aminitparallelscan_function) (void *target);
 /* (re)start parallel index scan */
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
+/*
+ * Callback function signatures - for parallel index vacuuming.
+ */
+/* estimate size of parallel index vacuuming memory */
+typedef Size (*amestimateparallelvacuum_function) (void);
+
 /*
  * API struct for an index AM.  Note this must be stored in a single palloc'd
  * chunk of memory.
@@ -197,6 +203,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
@@ -230,6 +240,9 @@ typedef struct IndexAmRoutine
 	amestimateparallelscan_function amestimateparallelscan; /* can be NULL */
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
+
+	/* interface functions to support parallel vacuum */
+	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
 
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a813b004be..48ed5bbac7 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -179,6 +179,7 @@ extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
 												void *callback_state);
 extern IndexBulkDeleteResult *index_vacuum_cleanup(IndexVacuumInfo *info,
 												   IndexBulkDeleteResult *stats);
+extern Size index_parallelvacuum_estimate(Relation indexRelation);
 extern bool index_can_return(Relation indexRelation, int attno);
 extern RegProcedure index_getprocid(Relation irel, AttrNumber attnum,
 									uint16 procnum);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..7b6f269785 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,34 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions, controlling whether bulkdelete
+ * and vacuumcleanup can participate in parallel vacuum. Both are
+ * disabled by default.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/* bulkdelete can be performed in parallel */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is
+ * not performed yet.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/* vacuumcleanup can be performed in parallel */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
+
+/* Macros for parallel vacuum options */
+#define VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(flag) \
+	((((flag) & VACUUM_OPTION_PARALLEL_BULKDEL)) != 0)
+#define VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(flag) \
+	((((flag) & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0) || \
+	 (((flag) & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..096534a6ee 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
@@ -317,6 +320,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
-- 
2.23.0

v33-0003-Add-parallel-P-option-to-vacuumdb-command.patch (application/octet-stream)
From 6828071bf3d34fd8973a7e64892d45f0478f6d73 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v33 3/3] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum using
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option requires background workers, so make sure your
+        <xref linkend="guc-max-parallel-maintenance-workers"/> setting is
+        greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow 0, meaning PARALLEL without an explicit parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.23.0

v33-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 90c02dd6e38f7c1e6c9cdf5b9725e0a5add5327c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v33 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than the
number of indexes that the table has.

The parallel degree is either specified by the user or determined
based on the number of indexes that the table has, and is further
limited by max_parallel_maintenance_workers. The table size and index
size don't affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1214 +++++++++++++++++++++++--
 src/backend/access/transam/parallel.c |    4 +
 src/backend/commands/vacuum.c         |  109 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    6 +
 src/include/commands/vacuum.h         |    5 +
 src/test/regress/expected/vacuum.out  |   26 +
 src/test/regress/sql/vacuum.sql       |   25 +
 11 files changed, 1340 insertions(+), 112 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f83770350e..90ac399228 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2289,13 +2289,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> (only when building a B-tree index)
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..ae086b976b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel vacuum
+      operation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>. Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index. So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table. Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase. These behaviors might change in a future release. This
+      option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for the vacuum part.
+     Even if this option is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..c2fe56a4b2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Each individual index is processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed the parallel worker processes exit.  The leader
+ * process then re-initializes the parallel context while keeping the recorded
+ * dead tuples so that it can launch parallel workers again the next time.
+ * Note that parallel workers live only during index vacuuming or index
+ * cleanup, but the leader process neither exits from parallel mode nor
+ * destroys the parallel context in between.  Since no updates are allowed
+ * during parallel mode, the index statistics are updated after exiting
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,180 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check whether we are in a parallel lazy vacuum. If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum mode;
+ * otherwise it is allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel vacuum workers, and hence allocated
+ * in the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells vacuum workers whether to do index vacuuming or index
+	 * cleanup. first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool	for_cleanup;
+	bool	first_time;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either
+	 * the old live tuples (in the index vacuuming case) or the new live
+	 * tuples (in the index cleanup case).
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In single-process lazy vacuum we could consume memory during index
+	 * vacuuming or cleanup, apart from the memory for heap scanning, if
+	 * an index consumes memory during ambulkdelete and amvacuumcleanup.
+	 * In parallel index vacuuming, since individual vacuum workers
+	 * consume memory, we set a new maintenance_work_mem for each worker
+	 * so as not to consume more memory than a single-process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go
+	 * for the delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The index statistics
+	 * returned from ambulkdelete and amvacuumcleanup are nullable and of
+	 * variable length.  'bitmap' is the NULL bitmap: a 0 indicates a null,
+	 * while a 1 indicates non-null.  The index statistics follow at the
+	 * end of the struct.
+	 */
+	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers, including the leader process,
+ * to have a shared view of cost-related parameters (mainly VacuumCostBalance),
+ * to let each worker update it, and then, based on that, to decide
+ * whether it needs to sleep.  Additionally, we allow a worker to sleep
+ * only if it has performed I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than the VacuumCostLimit set by
+ * the system.  We then allow the worker to sleep in proportion to the
+ * work done and reduce VacuumSharedCostBalance by the amount consumed by
+ * the current worker (VacuumCostBalanceLocal).  This avoids putting to
+ * sleep workers that have done little or no I/O as compared to other
+ * workers, and therefore ensures that workers doing more I/O get
+ * throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
+ * follows at end of struct.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated */
+
+	/* Index bulk-deletion result data follows at end of struct */
+} LVSharedIndStats;
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * The number of indexes that do NOT support parallel
+	 * index bulk-deletion and parallel index cleanup respectively.
+	 */
+	int				nindexes_nonparallel_bulkdel;
+	int				nindexes_nonparallel_cleanup;
+
+	/*
+	 * Always true except for the debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +321,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +344,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +357,39 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static bool skip_parallel_index_vacuum(Relation indrel, bool for_cleanup,
+									   bool first_time);
 
 
 /*
@@ -488,6 +703,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		the heap scan so that we can record dead tuples in the DSM segment. All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes. At the end of
+ *		this function we exit from parallel mode. Index bulk-deletion results
+ *		are stored in the DSM segment, and we update the index statistics as a
+ *		whole after exiting parallel mode, since no writes are allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +723,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +747,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +783,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we are vacuuming indexes,
+	 * compute the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum. We allocate the memory space for the
+		 * dead tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +995,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +1024,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1044,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1240,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1279,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1425,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1495,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1524,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1639,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1673,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1689,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1715,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1793,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1802,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1850,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1861,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1991,336 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process.  The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform
+ * vacuuming or cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	int nindexes_remaining;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Enable shared cost balance */
+	VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+	/*
+	 * Set up shared cost balance and the number of active workers for
+	 * vacuum delay.
+	 */
+	pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+	pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+	/*
+	 * Reset the local values so that we compute the cost balance afresh
+	 * during parallel index vacuuming.
+	 */
+	VacuumCostBalance = 0;
+	VacuumCostBalanceLocal = 0;
+
+	/* Launch all workers */
+	LaunchParallelWorkers(lps->pcxt);
+
+	if (lps->lvshared->for_cleanup)
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+								 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+	else
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+								 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Increment the active worker count.  We cannot decrement it until
+	 * all parallel workers have finished.
+	 */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/*
+	 * Join as a parallel worker.  If no workers were launched, the leader
+	 * process alone does the work.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * At this point only the indexes that were skipped during parallel index
+	 * vacuuming remain.  If there are such indexes, the leader process
+	 * vacuums or cleans them up one by one.
+	 */
+	nindexes_remaining = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	if (nindexes_remaining > 0)
+	{
+		int i;
+#ifdef USE_ASSERT_CHECKING
+		int nprocessed = 0;
+#endif
+
+		for (i = 0; i < nindexes; i++)
+		{
+			bool processed = !skip_parallel_index_vacuum(Irel[i],
+														 lps->lvshared->for_cleanup,
+														 lps->lvshared->first_time);
+
+			/* Skip the already processed indexes */
+			if (processed)
+				continue;
+
+			if (lps->lvshared->for_cleanup)
+				lazy_cleanup_index(Irel[i], &stats[i],
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+			else
+				lazy_vacuum_index(Irel[i], &stats[i], vacrelstats->dead_tuples,
+								  vacrelstats->old_live_tuples);
+#ifdef USE_ASSERT_CHECKING
+			nprocessed++;
+#endif
+		}
+#ifdef USE_ASSERT_CHECKING
+		Assert(nprocessed == nindexes_remaining);
+#endif
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value over to the heap scan */
+	VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+	VacuumActiveNWorkers = NULL;
+
+	/*
+	 * In the cleanup case we don't need to reinitialize the parallel
+	 * context, as no more index vacuuming or index cleanup will be
+	 * performed afterwards.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing counts */
+		pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes, including the leader process.  After processing each
+ * index, this function copies the index statistics returned by
+ * ambulkdelete and amvacuumcleanup to the DSM segment.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+		IndexBulkDeleteResult *bulkdelete_res;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Skip if this index doesn't support parallel execution
+		 * at this time.
+		 */
+		if (skip_parallel_index_vacuum(Irel[idx], lvshared->for_cleanup,
+									   lvshared->first_time))
+			continue;
+
+		/* Get index statistics struct of this index */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/* Skip if this index doesn't support parallel index vacuuming */
+		if (shared_indstats == NULL)
+			continue;
+
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && stats[idx] == NULL)
+			stats[idx] = bulkdelete_res;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned by ambulkdelete
+		 * and amvacuumcleanup to the DSM segment the first time we get
+		 * it, because those callbacks allocate it locally and a
+		 * different vacuum process might process this index next time.
+		 * Copying the result normally happens only after the first
+		 * round of index vacuuming.  From the second round on, we pass
+		 * the result stored in the DSM segment so that the callbacks
+		 * update it directly.
+		 *
+		 * Since each vacuum worker writes its bulk-deletion result to a
+		 * different slot, we can write them without locking.
+		 */
+		if (!shared_indstats->updated && stats[idx] != NULL)
+		{
+			memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+			shared_indstats->updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = bulkdelete_res;
+		}
+	}
+}
+
+/*
+ * Clean up indexes.  In the parallel vacuum case, this function must be
+ * used only by the parallel vacuum leader process.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active, we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes.  In the parallel vacuum case, this function must be
+ * used only by the parallel vacuum leader process.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active, we perform index vacuuming with
+	 * parallel workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2330,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2369,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
 
-	pfree(stats);
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2732,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2756,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2812,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2965,406 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The relation sizes of the table and indexes don't affect the parallel
+ * degree for now.  nrequested is the number of parallel workers that the
+ * user requested.  If nrequested is 0, we compute the parallel degree
+ * based on the number of indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (Irel[i]->rd_indam->amparallelvacuumoptions !=
+			VACUUM_OPTION_NO_PARALLEL)
+			nindexes_to_vacuum++;
+	}
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_to_vacuum == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(!((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) &&
+				 (vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP)));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
+		{
+			est_shared = add_size(est_shared,
+									add_size(sizeof(LVSharedIndStats),
+											 index_parallelvacuum_estimate(Irel[i])));
+
+			/*
+			 * Remember the number of indexes that don't support parallel
+			 * bulk-deletion and parallel cleanup respectively.
+			 */
+			if (!VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(vacoptions))
+				lps->nindexes_nonparallel_bulkdel++;
+			if (!VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(vacoptions))
+				lps->nindexes_nonparallel_cleanup++;
+		}
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
+	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the struct size of each index.  This function also counts the indexes
+ * that use maintenance_work_mem in order to divide it among the workers.
+ * Since we don't currently support parallel vacuum for autovacuum, we
+ * don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+						 int nworkers)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+		{
+			/*
+			 * Leave the bit zero (NULL entry), as this index does not
+			 * support parallel vacuum.  The DSM space was zero-initialized.
+			 */
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+		maintenance_work_mem;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed while in parallel mode, and it might not be
+ * safe to exit parallel mode while keeping the parallel context.  So we
+ * copy the updated index statistics into local memory and later use that
+ * to update the index statistics in pg_class.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check whether the given index participates in parallel index vacuuming
+ * or parallel index cleanup.  for_cleanup indicates whether we are doing
+ * index cleanup or index bulk-deletion.  first_time is true if bulk-deletion
+ * has not been performed yet.  Returns true if the index should be skipped.
+ */
+static bool
+skip_parallel_index_vacuum(Relation indrel, bool for_cleanup,
+						   bool first_time)
+{
+	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(for_cleanup || !first_time);
+
+	if (for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (!VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(vacoptions))
+			return true;
+
+		/*
+		 * Skip if the index supports parallel cleanup only for the
+		 * first-time cleanup, but this is not the first time.
+		 */
+		if (!first_time &&
+			(vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			return true;
+	}
+	else if (!VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(vacoptions))
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as that of the leader
+	 * process.  It's okay because the lock mode does not conflict among
+	 * the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Increment the active worker count. */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Do either index vacuuming or index cleanup */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..b42cbd58d8 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..4b7f480fd6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested, but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1768,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is
+	 * the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = 0;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1985,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (!VacuumCostActive || InterruptPending)
+		return;
+
+	/*
+	 * If the vacuum cost balance is shared among parallel workers, we
+	 * decide whether to sleep based on that.
+	 */
+	if (VacuumSharedCostBalance != NULL)
 	{
-		double		msec;
+		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+		/* At least the current process itself is counted */
+		Assert(nworkers >= 1);
+
+		/* Update the shared cost balance value atomically */
+		while (true)
+		{
+			uint32 shared_balance;
+			uint32 new_balance;
+			uint32 local_balance;
+
+			msec = 0;
+
+			/* compute new balance by adding the local value */
+			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+			new_balance = shared_balance + VacuumCostBalance;
 
+			/* also compute the total local balance */
+			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+			if ((new_balance >= VacuumCostLimit) &&
+				(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+			{
+				/* compute sleep time based on the local cost balance */
+				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+				new_balance = shared_balance - VacuumCostBalanceLocal;
+				VacuumCostBalanceLocal = 0;
+			}
+
+			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+											   &shared_balance,
+											   new_balance))
+			{
+				/* Updated successfully, break */
+				break;
+			}
+		}
+
+		VacuumCostBalanceLocal += VacuumCostBalance;
+
+		/*
+		 * Reset the local balance as we accumulated it into the shared
+		 * value.
+		 */
+		VacuumCostBalance = 0;
+	}
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 98c917bf7a..ce35be710f 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3560,7 +3560,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..61725e749f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -190,9 +192,13 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b6f269785..295b6a17f0 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -212,6 +212,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * workers, and 0 means the degree is chosen based on the number of
+	 * indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..cf5e1f0a4e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0aecf17773 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

#215Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#214)

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also, I added some test cases using all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch 5
workers for each execution, but some workers would do nothing at all. To
deal with this problem, I wonder if we can improve the parallel query
infrastructure so that the leader process creates a parallel context
with the maximum number of workers and can launch a subset of the
workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?
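
For illustration, something like this could run once before creating
the parallel context (just a sketch; the counter names are invented
here, reusing the patch's amparallelvacuumoptions flags):

    int nindexes_bulkdel = 0;
    int nindexes_cleanup = 0;
    int i;

    for (i = 0; i < nindexes; i++)
    {
        uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

        if (VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(vacoptions))
            nindexes_bulkdel++;
        if (VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(vacoptions))
            nindexes_cleanup++;
    }

    /* Size the parallel context once, for the larger of the two phases */
    parallel_workers = Max(nindexes_bulkdel, nindexes_cleanup);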

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#216Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#215)

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also, I added some test cases using all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch 5
workers for each execution, but some workers would do nothing at all. To
deal with this problem, I wonder if we can improve the parallel query
infrastructure so that the leader process creates a parallel context
with the maximum number of workers and can launch a subset of the
workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers, the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers would be enough.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#217Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#216)

On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also, I added some test cases using all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch 5
workers for each execution, but some workers would do nothing at all. To
deal with this problem, I wonder if we can improve the parallel query
infrastructure so that the leader process creates a parallel context
with the maximum number of workers and can launch a subset of the
workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers, the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers would be enough.

Right, so we can either extend the parallel API to launch fewer workers
than the parallel context has, as you suggested, or we can use a
separate parallel context for each phase. Going with the former has the
benefit that we don't need to recreate the parallel context, and the
latter has the advantage that we won't keep additional shared memory
allocated. BTW, what kind of API change do you have in mind for the
approach you are suggesting?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#218Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#217)

On Wed, 20 Nov 2019 at 20:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also, I added some test cases using all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch 5
workers for each execution, but some workers would do nothing at all. To
deal with this problem, I wonder if we can improve the parallel query
infrastructure so that the leader process creates a parallel context
with the maximum number of workers and can launch a subset of the
workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers, the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers would be enough.

Right, so we can either extend the parallel API to launch fewer workers
than the parallel context has, as you suggested, or we can use a
separate parallel context for each phase. Going with the former has the
benefit that we don't need to recreate the parallel context, and the
latter has the advantage that we won't keep additional shared memory
allocated.

I also thought of using separate parallel contexts for each phase, but
can the same DSM be used by parallel workers initiated from different
parallel contexts? If not, I think that doesn't work because parallel
vacuum needs to set data in the DSM during ambulkdelete, and then the
parallel workers for amvacuumcleanup need to access it.

BTW, what kind of API change do you have in mind for
the approach you are suggesting?

I was thinking of adding a new API, say LaunchParallelNWorkers(pcxt, n),
where n is the number of workers the caller wants to launch, which
should be lower than the value in the parallel context.
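
Roughly like the following (just a sketch; the name and exact shape are
not settled, and nworkers_for_this_phase is a placeholder):

    /*
     * Launch up to nworkers parallel workers.  nworkers must not exceed
     * pcxt->nworkers, the number the context was created with.
     */
    extern void LaunchParallelNWorkers(ParallelContext *pcxt, int nworkers);

    /* e.g. in the vacuum leader, before each phase: */
    LaunchParallelNWorkers(lps->pcxt, nworkers_for_this_phase);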

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#219Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#218)

On Thu, Nov 21, 2019 at 6:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 20:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also, I added some test cases using all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
that support either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel bulk-deletion
and 2 indexes supporting only parallel index cleanup, we would launch 5
workers for each execution, but some workers would do nothing at all. To
deal with this problem, I wonder if we can improve the parallel query
infrastructure so that the leader process creates a parallel context
with the maximum number of workers and can launch a subset of the
workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers, the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers would be enough.

Right, so we can either extend the parallel API to launch fewer workers
than the parallel context has, as you suggested, or we can use a
separate parallel context for each phase. Going with the former has the
benefit that we don't need to recreate the parallel context, and the
latter has the advantage that we won't keep additional shared memory
allocated.

I also thought of using separate parallel contexts for each phase, but
can the same DSM be used by parallel workers initiated from different
parallel contexts? If not, I think that doesn't work because parallel
vacuum needs to set data in the DSM during ambulkdelete, and then the
parallel workers for amvacuumcleanup need to access it.

We can probably copy the stats into local memory instead of pointing to
the DSM after bulk-deletion, but I think that would add unnecessary
overhead and doesn't sound like a good idea.
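
(A rough sketch of what that copying would look like, essentially what
end_parallel_vacuum in the patch already does for the leader; dsm_stats
here is a placeholder for the slot in the DSM segment:)

    /* Detach the bulk-deletion result from the DSM segment */
    IndexBulkDeleteResult *local_stats =
        (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));

    memcpy(local_stats, dsm_stats, sizeof(IndexBulkDeleteResult));
    stats[i] = local_stats;   /* now independent of the parallel context */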

BTW, what kind of API change do you have in mind for
the approach you are suggesting?

I was thinking of adding a new API, say LaunchParallelNWorkers(pcxt, n),
where n is the number of workers the caller wants to launch, which
should be lower than the value in the parallel context.

For that, won't you need to duplicate most of the code of
LaunchParallelWorkers, or maybe move the entire code into
LaunchParallelNWorkers so that LaunchParallelWorkers can also call it?
Another idea could be to just extend the existing API
LaunchParallelWorkers to take the number of workers as an input
parameter. Do you see any problem with that, or is there a reason you
prefer to write a new API for this?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#220Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#219)

On Thu, Nov 21, 2019 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Nov 21, 2019 at 6:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 20:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers are enough.

Right, so we can either extend the parallel API to launch fewer
workers than the parallel context has, as suggested by you, or we can
use a separate parallel context for each phase. Going with the former
has the benefit that we don't need to recreate the parallel context,
and the latter has the advantage that we won't keep additional shared
memory allocated.

I also thought of using separate parallel contexts for each phase,
but can the same DSM be used by parallel workers initiated from
different parallel contexts? If not, I think that doesn't work,
because the parallel vacuum needs to set data in the DSM during
ambulkdelete and the parallel workers for amvacuumcleanup then need to
access it.

We can probably copy the stats into local memory instead of pointing
them to the DSM after bulk-deletion, but I think that would add
unnecessary overhead and doesn't sound like a good idea.

I agree that it will be unnecessary overhead.

BTW, what kind of API change do you have in mind for
the approach you are suggesting?

I was thinking of adding a new API, say LaunchParallelNWorkers(pcxt, n),
where n is the number of workers the caller wants to launch, which
should be lower than the value in the parallel context.

For that, won't you need to duplicate most of the code of
LaunchParallelWorkers, or maybe move the entire code into
LaunchParallelNWorkers so that LaunchParallelWorkers can also call it?
Another idea could be to just extend the existing API
LaunchParallelWorkers to take the number of workers as an input
parameter. Do you see any problem with that, or is there a reason you
prefer to write a new API for this?

I think we can pass an extra parameter to LaunchParallelWorkers,
wherein we can try to launch min(pcxt->nworkers, n). Or we can put an
assert(n <= pcxt->nworkers).
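
For illustration, a minimal, self-contained sketch of that clamping
idea (the struct and function bodies here are simplified stand-ins,
not the real parallel.c code, and this two-argument signature is only
a proposal in this thread):

#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for the real ParallelContext; hypothetical. */
typedef struct ParallelContext
{
	int		nworkers;			/* workers the context was created with */
	int		nworkers_launched;	/* workers actually started */
} ParallelContext;

static int
Min(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Sketch of the proposed change: take the number of workers to launch
 * and clamp it to pcxt->nworkers instead of always launching all of them.
 */
static void
LaunchParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
{
	assert(nworkers_to_launch >= 0);
	pcxt->nworkers_launched = Min(pcxt->nworkers, nworkers_to_launch);
	/* the real code would register one background worker per slot here */
}

int
main(void)
{
	ParallelContext pcxt = {3, 0};	/* context sized for bulk-deletion */

	LaunchParallelWorkers(&pcxt, 2);	/* cleanup phase needs only 2 */
	printf("launched %d of %d workers\n",
		   pcxt.nworkers_launched, pcxt.nworkers);
	return 0;
}

The point is only the clamping: the context is sized once for the
maximum, and each phase launches only what it needs.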

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#221Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#214)

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.
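
To make the flag scheme concrete, a small sketch of the option bits as
discussed above (NO_PARALLEL as 0, plus the proposed assertion that an
AM never claims both cleanup variants); the values mirror this thread,
but the final definitions and their placement in vacuum.h are not
settled here:

#include <assert.h>
#include <stdint.h>

/* Option bits as discussed: NO_PARALLEL is simply "no bits set". */
#define VACUUM_OPTION_NO_PARALLEL			0
#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 3)

/* An AM must not claim both conditional and unconditional cleanup. */
static void
check_parallel_vacuum_options(uint8_t options)
{
	assert(!((options & VACUUM_OPTION_PARALLEL_COND_CLEANUP) &&
			 (options & VACUUM_OPTION_PARALLEL_CLEANUP)));
}

int
main(void)
{
	/* e.g. btree: parallel bulkdelete, cleanup only if no bulkdelete ran */
	uint8_t btree_options = VACUUM_OPTION_PARALLEL_BULKDEL |
							VACUUM_OPTION_PARALLEL_COND_CLEANUP;

	check_parallel_vacuum_options(btree_options);
	return 0;
}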

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance? I
see that later we add VacuumCostBalance to VacuumCostBalanceLocal, so
we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#222Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#221)

On Thu, Nov 21, 2019 at 10:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance? I
see that later we add VacuumCostBalance to VacuumCostBalanceLocal, so
we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+ Size nbytes;
+
+ RELATION_CHECKS;
+
+ /*
+ * If amestimateparallelvacuum is not provided, assume only
+ * IndexBulkDeleteResult is needed.
+ */
+ if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+ {
+ nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+ Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+ }
+ else
+ nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+ return nbytes;
+}

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind
patch, I am a bit doubtful about this kind of arrangement, where the
code in the "if" is always unreachable with the current AMs. I am not
sure what is the best way to handle this; should we just drop
amestimateparallelvacuum altogether? Because currently we are just
providing a size-estimate function without a copy function; even if in
the future some AM gives an estimate of the size of its stats, we
cannot directly memcpy the stats from local memory to shared memory.
We might then also need a copy function from the AM so that it can
flatten the stats and store them in the proper format.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#223Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#220)

On Thu, 21 Nov 2019 at 13:25, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Nov 21, 2019 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Nov 21, 2019 at 6:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 20:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

Can't we choose the number of workers as the maximum of
"num_of_indexes_that_support_bulk_del" and
"num_of_indexes_that_support_cleanup"? If we can do that, then we can
always launch the required number of workers for each phase (bulk_del,
cleanup). In your above example, it should choose 3 workers while
creating a parallel context. Do you see any problem with that?

I might be missing something, but if we create the parallel context
with 3 workers the leader process always launches 3 workers. Therefore
in the above case it launches 3 workers even for cleanup, although 2
workers are enough.

Right, so we can either extend the parallel API to launch fewer
workers than the parallel context has, as suggested by you, or we can
use a separate parallel context for each phase. Going with the former
has the benefit that we don't need to recreate the parallel context,
and the latter has the advantage that we won't keep additional shared
memory allocated.

I also thought of using separate parallel contexts for each phase,
but can the same DSM be used by parallel workers initiated from
different parallel contexts? If not, I think that doesn't work,
because the parallel vacuum needs to set data in the DSM during
ambulkdelete and the parallel workers for amvacuumcleanup then need to
access it.

We can probably copy the stats into local memory instead of pointing
them to the DSM after bulk-deletion, but I think that would add
unnecessary overhead and doesn't sound like a good idea.

Right.

I agree that it will be unnecessary overhead.

BTW, what kind of API change do you have in mind for
the approach you are suggesting?

I was thinking of adding a new API, say LaunchParallelNWorkers(pcxt, n),
where n is the number of workers the caller wants to launch, which
should be lower than the value in the parallel context.

For that, won't you need to duplicate most of the code of
LaunchParallelWorkers, or maybe move the entire code into
LaunchParallelNWorkers so that LaunchParallelWorkers can also call it?
Another idea could be to just extend the existing API
LaunchParallelWorkers to take the number of workers as an input
parameter. Do you see any problem with that, or is there a reason you
prefer to write a new API for this?

Yeah, passing an extra parameter to LaunchParallelWorkers seems to be
a good idea. I just thought that the current API is also reasonable
because the caller of LaunchParallelWorkers doesn't need to care about
the number of workers, which is helpful in some cases, for example,
where the caller of CreateParallelContext and the caller of
LaunchParallelWorkers are in different components. However, it's not a
problem since, as far as I can see in the current code, there is no
such usage (these functions are called from the same function).

I think we can pass an extra parameter to LaunchParallelWorkers,
wherein we can try to launch min(pcxt->nworkers, n). Or we can put an
assert(n <= pcxt->nworkers).

I prefer to use min(pcxt->nworkers, n).

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#224Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#221)

On Thu, 21 Nov 2019 at 14:16, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance?

Right.

I see that later we add VacuumCostBalance to VacuumCostBalanceLocal,
so we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.

I think we should not do VacuumCostBalanceLocal += VacuumCostBalance
inside the while loop because it's repeatedly executed until the CAS
operation succeeds. Instead we can move it before the loop and remove
local_balance? The code would be like the following:

if (VacuumSharedCostBalance != NULL)
{
:
VacuumCostBalanceLocal += VacuumCostBalance;
:
/* Update the shared cost balance value atomically */
while (true)
{
uint32 shared_balance;
uint32 new_balance;

msec = 0;

/* compute new balance by adding the local value */
shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
new_balance = shared_balance + VacuumCostBalance;

if ((new_balance >= VacuumCostLimit) &&
(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
{
/* compute sleep time based on the local cost balance */
msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
new_balance = shared_balance - VacuumCostBalanceLocal;
VacuumCostBalanceLocal = 0;
}

if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
&shared_balance,
new_balance))
{
/* Updated successfully, break */
break;
}
}

:
VacuumCostBalance = 0;
}

Thoughts?
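
As a quick sanity check of the arithmetic above (numbers chosen purely
for illustration, not taken from the patch): with VacuumCostLimit =
200, VacuumCostDelay = 20 and VacuumCostBalanceLocal = 100 when the
condition fires, the worker sleeps for msec = 20 * 100 / 200 = 10, and
the shared balance is decremented by the same 100 this worker had
contributed, so each worker's sleeping stays proportional to its own
share of the accumulated cost.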

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#225Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#222)

On Thu, 21 Nov 2019 at 14:32, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Nov 21, 2019 at 10:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance? I
see that later we add VacuumCostBalance to VacuumCostBalanceLocal, so
we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+ Size nbytes;
+
+ RELATION_CHECKS;
+
+ /*
+ * If amestimateparallelvacuum is not provided, assume only
+ * IndexBulkDeleteResult is needed.
+ */
+ if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+ {
+ nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+ Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+ }
+ else
+ nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+ return nbytes;
+}

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind
patch, I am a bit doubtful about this kind of arrangement, where the
code in the "if" is always unreachable with the current AMs. I am not
sure what is the best way to handle this; should we just drop
amestimateparallelvacuum altogether?

IIUC the motivation of amestimateparallelvacuum is third-party index
AMs. If such an AM allocates more memory than IndexBulkDeleteResult,
like the current gist indexes do (although we'll change that), it will
break the index statistics of other indexes or can even cause a crash.
I'm not sure such third-party index AMs exist, and it's true that none
of the index AMs in the postgres code will use this callback, as you
mentioned, but I think we need to take care of it because such usage
is still possible.

Because currently we are just
providing a size-estimate function without a copy function; even if in
the future some AM gives an estimate of the size of its stats, we
cannot directly memcpy the stats from local memory to shared memory.
We might then also need a copy function from the AM so that it can
flatten the stats and store them in the proper format.

I might be missing something, but why can't we copy the stats from
local memory to the DSM without a callback for copying stats? The lazy
vacuum code will get the pointer to the stats allocated by the index
AM, and the code knows their size. So I think we can just memcpy them
to the DSM.
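
A minimal sketch of that idea (all names here are illustrative
stand-ins, not the patch's actual ones): since the leader knows the
estimated stats size for each index, it can copy whatever the AM
handed back into that index's DSM slot without any AM callback,
provided the stats are one flat allocation with no embedded pointers:

#include <stdlib.h>
#include <string.h>

/* Stand-in for IndexBulkDeleteResult; the real struct has more fields. */
typedef struct IndexBulkDeleteResult
{
	double	num_pages;
	double	tuples_removed;
} IndexBulkDeleteResult;

/*
 * Hypothetical helper: copy the AM-returned stats into this index's
 * slot in the DSM segment.  Works only if the stats are a single flat
 * allocation of the size the AM estimated (no embedded pointers).
 */
static void
copy_stats_to_dsm(void *dsm_slot, const IndexBulkDeleteResult *stats,
				  size_t estimated_size)
{
	memcpy(dsm_slot, stats, estimated_size);
}

int
main(void)
{
	size_t	sz = sizeof(IndexBulkDeleteResult);	/* from the estimate */
	void   *slot = malloc(sz);					/* stands in for DSM space */
	IndexBulkDeleteResult stats = {128.0, 5000.0};

	copy_stats_to_dsm(slot, &stats, sz);
	free(slot);
	return 0;
}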

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#226Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#224)

On Thu, 21 Nov 2019, 13:52 Masahiko Sawada, <masahiko.sawada@2ndquadrant.com> wrote:

On Thu, 21 Nov 2019 at 14:16, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance?

Right.

I see that later we add VacuumCostBalance to VacuumCostBalanceLocal,
so we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.

I think we should not do VacuumCostBalanceLocal += VacuumCostBalance
inside the while loop because it's repeatedly executed until the CAS
operation succeeds. Instead we can move it before the loop and remove
local_balance?

Right, I meant before the loop.

The code would be like the following:

if (VacuumSharedCostBalance != NULL)
{
:
VacuumCostBalanceLocal += VacuumCostBalance;
:
/* Update the shared cost balance value atomically */
while (true)
{
uint32 shared_balance;
uint32 new_balance;

msec = 0;

/* compute new balance by adding the local value */
shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
new_balance = shared_balance + VacuumCostBalance;

if ((new_balance >= VacuumCostLimit) &&
(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
{
/* compute sleep time based on the local cost balance */
msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
new_balance = shared_balance - VacuumCostBalanceLocal;
VacuumCostBalanceLocal = 0;
}

if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
&shared_balance,
new_balance))
{
/* Updated successfully, break */
break;
}
}

:
VacuumCostBalance = 0;
}

Thoughts?

Looks fine to me.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#227Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#225)

On Thu, 21 Nov 2019, 14:15 Masahiko Sawada, <masahiko.sawada@2ndquadrant.com> wrote:

On Thu, 21 Nov 2019 at 14:32, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Nov 21, 2019 at 10:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:

Based on these needs, we came up with a way to allow users to specify
this information for IndexAms. Basically, the IndexAm will expose a
variable amparallelvacuumoptions which can have the options below

VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.

Makes sense.

VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
gin, gist,
spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
via assertions; is that okay?

Sounds reasonable to me.

Are you planning to include the changes related to I/O throttling
based on the discussion in the nearby thread [1]? I think you can do
that if you agree with the conclusion in the last email[1], otherwise,
we can explore it separately.

Yes, I agree. I'm going to include those changes in the next version
of the patches. And I think we will be able to have more discussion
based on the patch.

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and parallel cleanup.
That also means the number of parallel workers to launch might differ
between parallel bulk-deletion and parallel cleanup. In the current
patch the leader always launches as many workers as there are indexes
supporting either one, but that would not be efficient in some cases.
For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers will do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context with the maximum number of workers and can launch a
subset of the workers instead of all of them.

+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+ (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+    &shared_balance,
+    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
While looking at the shared costing delay part, I have noticed that
while checking the delay condition we consider local_balance, which is
VacuumCostBalanceLocal + VacuumCostBalance, but while computing the
new balance we only reduce the shared balance by
VacuumCostBalanceLocal; shouldn't it be reduced by local_balance? I
see that later we add VacuumCostBalance to VacuumCostBalanceLocal, so
we are not losing accounting for this balance. But I feel it is not
right that we compare based on one value and operate based on another.
I think we can immediately set VacuumCostBalanceLocal +=
VacuumCostBalance before checking the condition.
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+ Size nbytes;
+
+ RELATION_CHECKS;
+
+ /*
+ * If amestimateparallelvacuum is not provided, assume only
+ * IndexBulkDeleteResult is needed.
+ */
+ if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+ {
+ nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+ Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+ }
+ else
+ nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+ return nbytes;
+}

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind
patch, I am a bit doubtful about this kind of arrangement, where the
code in the "if" is always unreachable with the current AMs. I am not
sure what is the best way to handle this; should we just drop
amestimateparallelvacuum altogether?

IIUC the motivation of amestimateparallelvacuum is third-party index
AMs. If such an AM allocates more memory than IndexBulkDeleteResult,
like the current gist indexes do (although we'll change that), it will
break the index statistics of other indexes or can even cause a crash.
I'm not sure such third-party index AMs exist, and it's true that none
of the index AMs in the postgres code will use this callback, as you
mentioned, but I think we need to take care of it because such usage
is still possible.

Because currently we are just
providing a size-estimate function without a copy function; even if in
the future some AM gives an estimate of the size of its stats, we
cannot directly memcpy the stats from local memory to shared memory.
We might then also need a copy function from the AM so that it can
flatten the stats and store them in the proper format.

I might be missing something, but why can't we copy the stats from
local memory to the DSM without a callback for copying stats? The lazy
vacuum code will get the pointer to the stats allocated by the index
AM, and the code knows their size. So I think we can just memcpy them
to the DSM.

Oh sure. But what I meant is that an AM may keep pointers in its
stats, as GistBulkDeleteResult does, so we might not be able to copy
them directly from outside the AM. So I thought that if we have a
callback for the copy, then the AM can flatten the stats such that
IndexBulkDeleteResult is followed by the AM-specific stats. But
someone may argue that we might as well force the AM to return the
stats in a form that can be memcpy'd directly. So I think I am fine
with the way it is.

#228Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#227)

On Thu, Nov 21, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, 21 Nov 2019, 14:15 Masahiko Sawada, <masahiko.sawada@2ndquadrant.com> wrote:

On Thu, 21 Nov 2019 at 14:32, Dilip Kumar <dilipbalaut@gmail.com> wrote:

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind
patch, I am a bit doubtful about this kind of arrangement, where the
code in the "if" is always unreachable with the current AMs. I am not
sure what is the best way to handle this; should we just drop
amestimateparallelvacuum altogether?

IIUC the motivation of amestimateparallelvacuum is third-party index
AMs. If such an AM allocates more memory than IndexBulkDeleteResult,
like the current gist indexes do (although we'll change that), it will
break the index statistics of other indexes or can even cause a crash.
I'm not sure such third-party index AMs exist, and it's true that none
of the index AMs in the postgres code will use this callback, as you
mentioned, but I think we need to take care of it because such usage
is still possible.

Because currently we are just
providing a size-estimate function without a copy function; even if in
the future some AM gives an estimate of the size of its stats, we
cannot directly memcpy the stats from local memory to shared memory.
We might then also need a copy function from the AM so that it can
flatten the stats and store them in the proper format.

I might be missing something, but why can't we copy the stats from
local memory to the DSM without a callback for copying stats? The lazy
vacuum code will get the pointer to the stats allocated by the index
AM, and the code knows their size. So I think we can just memcpy them
to the DSM.

Oh sure. But what I meant is that an AM may keep pointers in its stats, as GistBulkDeleteResult does, so we might not be able to copy them directly from outside the AM. So I thought that if we have a callback for the copy, then the AM can flatten the stats such that IndexBulkDeleteResult is followed by the AM-specific stats. But someone may argue that we might as well force the AM to return the stats in a form that can be memcpy'd directly. So I think I am fine with the way it is.

I think we have discussed this point earlier as well, and the
conclusion was to provide an API if there is a need for it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#229Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#214)
2 attachment(s)

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

I have reviewed the first patch and made a number of modifications,
including adding/modifying comments and making some corrections and
improvements in the documentation. You can find my changes in
v33-0001-delta-amit.patch. See if those look okay to you; if so,
please include them in the next version of the patch. I am attaching
both your version of the patch and my delta changes.

One comment on v33-0002-Add-parallel-option-to-VACUUM-command:

+ /* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+ est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
..
+ shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));

Here, don't you need to MAXALIGN the offset, since we are computing
it that way while estimating the shared memory? If not, then probably
some comments are required to explain it.
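
For concreteness, a sketch of the suggested fix using simplified
stand-ins for the PostgreSQL macros (the sizes below are made up for
illustration); applying MAXALIGN in both places keeps the offset
consistent with the estimate:

#include <stdio.h>
#include <stddef.h>

/* Simplified stand-ins; PostgreSQL's MAXALIGN rounds up to max alignment. */
#define MAXIMUM_ALIGNOF	8
#define MAXALIGN(LEN) \
	(((size_t) (LEN) + (MAXIMUM_ALIGNOF - 1)) & ~((size_t) (MAXIMUM_ALIGNOF - 1)))
#define BITMAPLEN(N)	(((N) + 7) / 8)		/* bytes for an N-bit bitmap */

int
main(void)
{
	size_t	SizeOfLVShared = 60;	/* made-up size for illustration */
	int		nindexes = 5;

	/* estimate and offset computed the same way, so they stay consistent */
	size_t	est_shared = MAXALIGN(SizeOfLVShared + BITMAPLEN(nindexes));
	size_t	offset = MAXALIGN(SizeOfLVShared + BITMAPLEN(nindexes));

	printf("est_shared = %zu, offset = %zu\n", est_shared, offset);
	return 0;
}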

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v33-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch (application/octet-stream)
From 1283b804697902b70e9cd234e36b853b126d6efe Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v33 1/3] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  5 ++++
 doc/src/sgml/indexam.sgml                     | 21 ++++++++++++++
 src/backend/access/brin/brin.c                |  5 ++++
 src/backend/access/gin/ginutil.c              |  5 ++++
 src/backend/access/gist/gist.c                |  5 ++++
 src/backend/access/hash/hash.c                |  4 +++
 src/backend/access/index/indexam.c            | 29 +++++++++++++++++++
 src/backend/access/nbtree/nbtree.c            |  4 +++
 src/backend/access/spgist/spgutils.c          |  5 ++++
 src/include/access/amapi.h                    | 13 +++++++++
 src/include/access/genam.h                    |  1 +
 src/include/commands/vacuum.h                 | 28 ++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  4 +++
 13 files changed, 129 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..cde36c5b49 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
@@ -144,6 +148,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..693171dc4f 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -149,6 +153,9 @@ typedef struct IndexAmRoutine
     amestimateparallelscan_function amestimateparallelscan;    /* can be NULL */
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
+
+    /* interface functions to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
   </para>
@@ -731,6 +738,20 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory which the
+   access method will need to copy the statistics to.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than the size of <structname>IndexBulkDeleteResult</structname>.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..fbb4af9df1 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
@@ -124,6 +128,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..8c174b28fc 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
@@ -76,6 +80,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8d9c8d025d..bbb630fb88 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -97,6 +101,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..10d6efdd9f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
@@ -95,6 +98,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..5238b9d38f 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -711,6 +711,35 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 	return indexRelation->rd_indam->amvacuumcleanup(info, stats);
 }
 
+/*
+ * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ *
+ * Currently, we don't pass any information to the AM-specific estimator,
+ * so it can probably only return a constant.  In the future, we might need
+ * to pass more information.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+	Size		nbytes;
+
+	RELATION_CHECKS;
+
+	/*
+	 * If amestimateparallelvacuum is not provided, assume only
+	 * IndexBulkDeleteResult is needed.
+	 */
+	if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+	{
+		nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+		Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+	}
+	else
+		nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+	return nbytes;
+}
+
 /* ----------------
  *		index_can_return
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad..6a8d12ecbf 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -123,6 +123,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
@@ -146,6 +149,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = btestimateparallelscan;
 	amroutine->aminitparallelscan = btinitparallelscan;
 	amroutine->amparallelrescan = btparallelrescan;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db147..bb3e855cce 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
@@ -79,6 +83,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..0fd399442d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -156,6 +156,12 @@ typedef void (*aminitparallelscan_function) (void *target);
 /* (re)start parallel index scan */
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
+/*
+ * Callback function signatures - for parallel index vacuuming.
+ */
+/* estimate size of parallel index vacuuming memory */
+typedef Size (*amestimateparallelvacuum_function) (void);
+
 /*
  * API struct for an index AM.  Note this must be stored in a single palloc'd
  * chunk of memory.
@@ -197,6 +203,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
@@ -230,6 +240,9 @@ typedef struct IndexAmRoutine
 	amestimateparallelscan_function amestimateparallelscan; /* can be NULL */
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
+
+	/* interface functions to support parallel vacuum */
+	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
 
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a813b004be..48ed5bbac7 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -179,6 +179,7 @@ extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
 												void *callback_state);
 extern IndexBulkDeleteResult *index_vacuum_cleanup(IndexVacuumInfo *info,
 												   IndexBulkDeleteResult *stats);
+extern Size index_parallelvacuum_estimate(Relation indexRelation);
 extern bool index_can_return(Relation indexRelation, int attno);
 extern RegProcedure index_getprocid(Relation irel, AttrNumber attnum,
 									uint16 procnum);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..7b6f269785 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,34 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags amparallelvacuumoptions to control participation
+ * of bulkdelete and vacuumcleanup. Both are disabled by
+ * default.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/* bulkdelete can be performed in parallel */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is
+ * not performed yet.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/* vacuumcleanup can be performed in parallel */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
+
+/* Macros for parallel vacuum options */
+#define VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(flag) \
+	((((flag) & VACUUM_OPTION_PARALLEL_BULKDEL)) != 0)
+#define VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(flag) \
+	((((flag) & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0) || \
+	 (((flag) & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..096534a6ee 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
@@ -317,6 +320,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
-- 
2.23.0

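As an illustration of the new index AM fields introduced above, here is a
minimal sketch (not part of the patch) of how a hypothetical "foo" AM
handler could fill them in; foohandler and the chosen option values are
invented for illustration only:

Datum
foohandler(PG_FUNCTION_ARGS)
{
	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

	/* ... other am* fields set as usual ... */

	/* bulkdelete in parallel; cleanup in parallel only if no bulkdelete ran */
	amroutine->amparallelvacuumoptions =
		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
	/* this AM does not use maintenance_work_mem during vacuum */
	amroutine->amusemaintenanceworkmem = false;
	/* NULL: only MAXALIGN(sizeof(IndexBulkDeleteResult)) is reserved in DSM */
	amroutine->amestimateparallelvacuum = NULL;

	PG_RETURN_POINTER(amroutine);
}
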
v33-0001-delta-amit.patch (application/octet-stream)
From 1113e7d0d524e912be0e1959e50bd0d27c9b5381 Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Fri, 22 Nov 2019 14:24:18 +0530
Subject: [PATCH] Fixed issues and added comments.

---
 doc/src/sgml/indexam.sgml          | 13 +++++++------
 src/backend/access/index/indexam.c |  6 ++----
 src/include/access/amapi.h         |  8 ++++----
 src/include/commands/vacuum.h      | 29 +++++++++++++++++++++--------
 4 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 693171dc4f..9fed438fc6 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -154,7 +154,7 @@ typedef struct IndexAmRoutine
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
 
-    /* interface functions to support parallel vacuum */
+    /* interface function to support parallel vacuum */
     amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
@@ -740,17 +740,18 @@ amparallelrescan (IndexScanDesc scan);
 
   <para>
 <programlisting>
-void
-amestimateparallelvacuum (IndexScanDesc scan);
+Size
+amestimateparallelvacuum (void);
 </programlisting>
-   Estimate and return the number of bytes of dynamic shared memory which the
-   access method will be needed to copy the statistics to.
+   Estimate and return the number of bytes of dynamic shared memory needed to
+   store statistics returned by the access method.
   </para>
 
   <para>
    It is not necessary to implement this function for access methods which
    do not support parallel vacuum or in cases where the access method does not
-   require more than size of <structname>IndexBulkDeleteResult</structname>.
+   require more than size of <structname>IndexBulkDeleteResult</structname> to
+   store statistics.
   </para>
  </sect1>
 
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 4801c326be..d176f0193b 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -712,11 +712,9 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 }
 
 /*
- * index_parallelvacuum_estimate - estimate shared memory for parallel vacuum
+ * index_parallelvacuum_estimate
  *
- * Currently, we don't pass any information to the AM-specific estimator,
- * so it can probably only return a constant.  In the future, we might need
- * to pass more information.
+ * Estimates the DSM space needed to store statistics for parallel vacuum.
  */
 Size
 index_parallelvacuum_estimate(Relation indexRelation)
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 0fd399442d..eb23f01ab6 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -157,9 +157,9 @@ typedef void (*aminitparallelscan_function) (void *target);
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
 /*
- * Callback function signatures - for parallel index vacuuming.
+ * Callback function signature - for parallel index vacuuming.
  */
-/* estimate size of parallel index vacuuming memory */
+/* estimate size of statistics needed for parallel index vacuum */
 typedef Size (*amestimateparallelvacuum_function) (void);
 
 /*
@@ -203,7 +203,7 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
-	/* OR of parallel vacuum flags */
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
 	uint8		amparallelvacuumoptions;
 	/* does AM use maintenance_work_mem? */
 	bool		amusemaintenanceworkmem;
@@ -241,7 +241,7 @@ typedef struct IndexAmRoutine
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
 
-	/* interface functions to support parallel vacuum */
+	/* interface function to support parallel vacuum */
 	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b6f269785..508d5762ae 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -24,28 +24,41 @@
 #include "utils/relcache.h"
 
 /*
- * Flags amparallelvacuumoptions to control participation
- * of bulkdelete and vacuumcleanup. Both are disabled by
- * default.
+ * Flags to control the participation of bulkdelete and vacuumcleanup in
+ * parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by index AMs that don't want to participate in parallel vacuum.
  */
 #define VACUUM_OPTION_NO_PARALLEL			0
 
-/* bulkdelete can be performed in parallel */
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * index AMs that need to scan the index to delete tuples.
+ */
 #define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
 
 /*
- * vacuumcleanup can be performed in parallel if bulkdelete is
- * not performed yet.
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by index AMs that scan the index during cleanup
+ * only if bulkdelete was not performed.
  */
 #define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
 
-/* vacuumcleanup can be performed in parallel */
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by index AMs that scan the index
+ * during the cleanup phase irrespective of whether the index was
+ * already scanned during the bulkdelete phase.
+ */
 #define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
 
 /* value for checking vacuum flags */
 #define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
-/* Macros for parallel vacuum options */
+/* macros for parallel vacuum options */
 #define VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(flag) \
 	((((flag) & VACUUM_OPTION_PARALLEL_BULKDEL)) != 0)
 #define VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(flag) \
-- 
2.16.2.windows.1

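To make the callback contract documented above concrete, an AM whose
statistics extend IndexBulkDeleteResult might provide something like the
following sketch (FooIndexStats and fooestimateparallelvacuum are
hypothetical names, not part of the patch):

/* Hypothetical AM-private statistics extending IndexBulkDeleteResult */
typedef struct FooIndexStats
{
	IndexBulkDeleteResult base;			/* must be the first field */
	int64		foo_private_counter;	/* AM-specific statistics */
} FooIndexStats;

/*
 * The returned size must be at least
 * MAXALIGN(sizeof(IndexBulkDeleteResult)), as asserted by
 * index_parallelvacuum_estimate().
 */
static Size
fooestimateparallelvacuum(void)
{
	return MAXALIGN(sizeof(FooIndexStats));
}
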
#230Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#229)
2 attachment(s)

On Fri, Nov 22, 2019 at 2:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

I have reviewed the first patch and made a number of modifications
that include adding/modifying comments, made some corrections and
modifications in the documentation. You can find my changes in
v33-0001-delta-amit.patch.

I have continued my review of this patch series and reviewed/hacked
the second patch. I have added/modified comments, changed the function
ordering in the file to make it consistent, and made a few other changes.
You can find my changes in v33-0002-delta-amit.patch. Are you
working on the review comments given recently? If you have not started
yet, it might be better to prepare a patch atop the v33 version, as
I am also going to work on this patch series; that way it will be easy
to merge changes. OTOH, if you are already working on those, then it
is fine; I can merge any remaining changes with your new patch.
Whatever the case, please let me know.

Few more comments on v33-0002-Add-parallel-option-to-VACUUM-command.patch:
---------------------------------------------------------------------------------------------------------------------------
1.
+ * leader process re-initializes the parallel context while keeping recorded
+ * dead tuples so that the leader can launch parallel workers again in the next
+ * time.

In this sentence, it is not clear to me why we need to keep the
recorded dead tuples while re-initializing the parallel context. The
next time workers are launched, they should process a new set of dead
tuples, no?

2.
lazy_parallel_vacuum_or_cleanup_indexes()
{
..
+	/*
+	 * Increment the active worker count. We cannot decrement until the
+	 * all parallel workers finish.
+	 */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/*
+	 * Join as parallel workers. The leader process alone does that in
+	 * case where no workers launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * Here, the indexes that had been skipped during parallel index vacuuming
+	 * are remaining. If there are such indexes the leader process does vacuum
+	 * or cleanup them one by one.
+	 */
+	nindexes_remains = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	if (nindexes_remains > 0)
+	{
+		int i;
+#ifdef USE_ASSERT_CHECKING
+		int nprocessed = 0;
+#endif
+
+		for (i = 0; i < nindexes; i++)
+		{
+			bool processed = !skip_parallel_index_vacuum(Irel[i],
+														 lps->lvshared->for_cleanup,
+														 lps->lvshared->first_time);
+
+			/* Skip the already processed indexes */
+			if (processed)
+				continue;
+
+			if (lps->lvshared->for_cleanup)
+				lazy_cleanup_index(Irel[i], &stats[i],
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+			else
+				lazy_vacuum_index(Irel[i], &stats[i], vacrelstats->dead_tuples,
+								  vacrelstats->old_live_tuples);
+#ifdef USE_ASSERT_CHECKING
+			nprocessed++;
+#endif
+		}
+#ifdef USE_ASSERT_CHECKING
+		Assert(nprocessed == nindexes_remains);
+#endif
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
..
}

Here, it seems that we can increment/decrement the
VacuumActiveNWorkers even when there is no work performed by the
leader backend. How about moving increment/decrement inside function
vacuum_or_cleanup_indexes_worker? In that case, we need to do it in
this function when we are actually doing an index vacuum or cleanup.
After doing that the other usage of increment/decrement of
VacuumActiveNWorkers in other function heap_parallel_vacuum_main can
be removed.
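
Roughly something like this (an untested sketch of the suggested change;
the loop body is elided):

static void
vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
								 IndexBulkDeleteResult **stats,
								 LVShared *lvshared,
								 LVDeadTuples *dead_tuples)
{
	/* Count this process as active only while it actually processes indexes */
	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);

	for (;;)
	{
		int idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);

		if (idx >= nindexes)
			break;

		/* ... vacuum or clean up Irel[idx] as before ... */
	}

	/* Done with index vacuuming/cleanup, so leave the active set */
	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
}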

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v33-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 90c02dd6e38f7c1e6c9cdf5b9725e0a5add5327c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v33 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum process. Therefore
parallel vacuum can be used when the table has at least two indexes,
and the parallel degree cannot be larger than the number of indexes
that the table has.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1214 +++++++++++++++++++++++--
 src/backend/access/transam/parallel.c |    4 +
 src/backend/commands/vacuum.c         |  109 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    6 +
 src/include/commands/vacuum.h         |    5 +
 src/test/regress/expected/vacuum.out  |   26 +
 src/test/regress/sql/vacuum.sql       |   25 +
 11 files changed, 1340 insertions(+), 112 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f83770350e..90ac399228 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2289,13 +2289,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..ae086b976b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel vacuum
+      operation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>. Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution. It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all. Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table. Workers for
+      vacuum are launched before the start of each phase and exit at the
+      end of the phase. These behaviors might change in a future release.
+      This option cannot be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the default value of the selected
+      option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for the vacuum part.
+     Even if this option is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..c2fe56a4b2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared information
+ * as well as the memory space for storing dead tuples.  When starting either
+ * index vacuuming or index cleanup, we launch parallel worker processes.  Once
+ * all indexes are processed the parallel worker processes exit.  And then the
+ * leader process re-initializes the parallel context while keeping recorded
+ * dead tuples so that the leader can launch parallel workers again in the next
+ * time.  Note that all parallel workers live during either index vacuuming or
+ * index cleanup, but the leader process neither exits from parallel mode
+ * nor destroys the parallel context in between.  Since no updates are allowed
+ * during parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,180 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
+ * This is allocated in the DSM segment in parallel lazy vacuum
+ * mode; otherwise it is allocated in local memory.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers. So this is allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers to do either index vacuuming or
+	 * index cleanup. first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool	for_cleanup;
+	bool	first_time;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old live tuples in the index vacuuming case or the new live
+	 * tuples in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning
+	 * if an index consumes memory during ambulkdelete and amvacuumcleanup.
+	 * In parallel index vacuuming, since individual vacuum workers also
+	 * consume memory, we set a new maintenance_work_mem for each worker
+	 * so as not to consume more memory than single process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to
+	 * sleep for the cost-based delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The index statistics
+	 * returned from ambulkdelete and amvacuumcleanup are nullable and of
+	 * variable length.  'bitmap' is the NULL bitmap. Note that a 0 indicates
+	 * a null, while 1 indicates non-null.  The index statistics follow at
+	 * the end of the struct.
+	 */
+	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process
+ * to have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep
+ * only if it has performed the I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than VacuumCostLimit set by the
+ * system.  Then we will allow the worker to sleep proportional to the work
+ * done and reduce the VacuumSharedCostBalance by the amount which is
+ * consumed by the current worker (VacuumCostBalanceLocal).  This can
+ * avoid making workers sleep that have done less or no I/O as compared
+ * to other workers, and therefore can ensure that workers that are doing
+ * more I/O get throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
+ * follows at end of struct.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated */
+
+	/* Index bulk-deletion result data follows at end of struct */
+} LVSharedIndStats;
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for parallel lazy vacuum */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * The number of indexes that do NOT support parallel
+	 * index bulk-deletion and parallel index cleanup, respectively.
+	 */
+	int				nindexes_nonparallel_bulkdel;
+	int				nindexes_nonparallel_cleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +321,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +344,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +357,39 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static bool skip_parallel_index_vacuum(Relation indrel, bool for_cleanup,
+									   bool first_time);
 
 
 /*
@@ -488,6 +703,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		the heap scan so that we can record dead tuples to the DSM segment. All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes. At the end of
+ *		this function we exit from parallel mode. Index bulk-deletion results
+ *		are stored in the DSM segment, and we update the index statistics as a
+ *		whole after exiting parallel mode, since writes are not allowed during
+ *		parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +723,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +747,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +783,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
+	 * the number of parallel vacuum workers to launch.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +995,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +1024,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1044,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1240,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1279,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1425,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1495,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1524,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1639,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1673,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1689,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1715,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1793,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1802,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1850,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1861,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1991,336 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers. This function
+ * must be used by the parallel vacuum leader process. The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to do vacuuming or cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	int nindexes_remains;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Enable shared cost balance */
+	VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+	/*
+	 * Set up shared cost balance and the number of active workers for
+	 * vacuum delay.
+	 */
+	pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+	pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+	/*
+	 * Reset the local value so that we compute cost balance during
+	 * parallel index vacuuming.
+	 */
+	VacuumCostBalance = 0;
+	VacuumCostBalanceLocal = 0;
+
+	/* Launch all workers */
+	LaunchParallelWorkers(lps->pcxt);
+
+	if (lps->lvshared->for_cleanup)
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+								 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+	else
+		ereport(elevel,
+				(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+								 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+								 lps->pcxt->nworkers_launched),
+						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
+
+	/*
+	 * Increment the active worker count. We cannot decrement it until
+	 * all parallel workers finish.
+	 */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/*
+	 * Join as a parallel worker. The leader process alone does that in
+	 * the case where no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * At this point, only the indexes that were skipped during parallel index
+	 * vacuuming remain. If there are such indexes, the leader process vacuums
+	 * or cleans them up one by one.
+	 */
+	nindexes_remains = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	if (nindexes_remains > 0)
+	{
+		int i;
+#ifdef USE_ASSERT_CHECKING
+		int nprocessed = 0;
+#endif
+
+		for (i = 0; i < nindexes; i++)
+		{
+			bool processed = !skip_parallel_index_vacuum(Irel[i],
+														 lps->lvshared->for_cleanup,
+														 lps->lvshared->first_time);
+
+			/* Skip the already processed indexes */
+			if (processed)
+				continue;
+
+			if (lps->lvshared->for_cleanup)
+				lazy_cleanup_index(Irel[i], &stats[i],
+								   vacrelstats->new_rel_tuples,
+								   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+			else
+				lazy_vacuum_index(Irel[i], &stats[i], vacrelstats->dead_tuples,
+								  vacrelstats->old_live_tuples);
+#ifdef USE_ASSERT_CHECKING
+			nprocessed++;
+#endif
+		}
+#ifdef USE_ASSERT_CHECKING
+		Assert(nprocessed == nindexes_remains);
+#endif
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value over to the heap scan */
+	VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+	VacuumActiveNWorkers = NULL;
+
+	/*
+	 * In the cleanup case we don't need to reinitialize the parallel
+	 * context, as no more index vacuuming or index cleanup will be
+	 * performed after that.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing counts */
+		pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes including the leader process.  After finishing each
+ * index, this function copies the index statistics returned from
+ * ambulkdelete and amvacuumcleanup to the DSM segment.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+		IndexBulkDeleteResult *bulkdelete_res;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Skip if this index doesn't support parallel execution
+		 * at this time.
+		 */
+		if (skip_parallel_index_vacuum(Irel[idx], lvshared->for_cleanup,
+									   lvshared->first_time))
+			continue;
+
+		/* Get index statistics struct of this index */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/* Skip if this index doesn't support parallel index vacuuming */
+		if (shared_indstats == NULL)
+			continue;
+
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && stats[idx] == NULL)
+			stats[idx] = bulkdelete_res;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup one index */
+		if (lvshared->for_cleanup)
+			lazy_cleanup_index(Irel[idx], &(stats[idx]), lvshared->reltuples,
+							   lvshared->estimated_count);
+		else
+			lazy_vacuum_index(Irel[idx], &(stats[idx]), dead_tuples,
+							  lvshared->reltuples);
+
+		/*
+		 * Copy the index bulk-deletion result returned from ambulkdelete
+		 * and amvacuumcleanup to the DSM segment the first time we get
+		 * it from them, because they allocate it locally and it's
+		 * possible that an index will be vacuumed by a different vacuum
+		 * process next time.  Copying the result normally happens only
+		 * after the first index vacuuming.  From the second time onward,
+		 * we pass the result in the DSM segment so that they update it
+		 * directly.
+		 *
+		 * Since all vacuum workers write the bulk-deletion result into
+		 * different slots, we can write them without locking.
+		 */
+		if (!shared_indstats->updated && stats[idx] != NULL)
+		{
+			memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+			shared_indstats->updated = true;
+
+			/*
+			 * We no longer need the locally allocated result; stats[idx]
+			 * now points into the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = bulkdelete_res;
+		}
+	}
+}
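The loop above amounts to work distribution through a shared fetch-and-add counter.  A standalone sketch of the same pattern, using C11 atomics in place of pg_atomic and a hypothetical process_index() callback:

    #include <stdatomic.h>

    void process_index(unsigned idx);   /* hypothetical per-index work */

    static atomic_uint next_idx;        /* lives in the DSM segment in the patch */

    static void
    worker_loop(unsigned num_indexes)
    {
        for (;;)
        {
            /* Atomically claim the next unprocessed index */
            unsigned idx = atomic_fetch_add(&next_idx, 1);

            if (idx >= num_indexes)
                break;                  /* every index has been claimed */

            process_index(idx);
        }
    }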
+
+/*
+ * Cleanup indexes.  This function must be used by the parallel vacuum
+ * leader process in parallel vacuum case.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
+
+/*
+ * Vacuum indexes. This function must be used by the parallel vacuum leader
+ * process in parallel vacuum case.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index vacuuming with
+	 * parallel workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2330,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2369,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
+
+/*
+ * Update index statistics in pg_class if the statistics is accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
 
-	pfree(stats);
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2732,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2756,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
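As a rough worked example (assuming the usual 6-byte ItemPointerData and ignoring the relation-size clamp elided above): with maintenance_work_mem set to 64MB, a vacuum that must scan indexes can record up to 64 * 1024 * 1024 / 6, roughly 11 million, dead tuple TIDs before an index vacuuming pass is forced.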
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2812,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +2965,406 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The sizes of the table and its indexes don't affect the parallel degree
+ * for now.  nrequested is the number of parallel workers that the user
+ * requested.  If nrequested is 0 we compute the parallel degree based on
+ * nindexes, the number of indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_to_vacuum = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		if (Irel[i]->rd_indam->amparallelvacuumoptions !=
+			VACUUM_OPTION_NO_PARALLEL)
+			nindexes_to_vacuum++;
+	}
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_to_vacuum == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_to_vacuum--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_to_vacuum) : nindexes_to_vacuum;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
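A hedged illustration of the computation: for a table with four indexes of which three support some form of parallel vacuum, leader participation reduces nindexes_to_vacuum from 3 to 2, so VACUUM (PARALLEL) requests two workers and VACUUM (PARALLEL 8) still requests only Min(8, 2) = 2, both further capped by max_parallel_maintenance_workers.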
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(!((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) &&
+				 (vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP)));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
+		{
+			est_shared = add_size(est_shared,
+									add_size(sizeof(LVSharedIndStats),
+											 index_parallelvacuum_estimate(Irel[i])));
+
+			/*
+			 * Remember the number of indexes that don't support parallel
+			 * bulk-deletion and parallel cleanup respectively.
+			 */
+			if (!VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(vacoptions))
+				lps->nindexes_nonparallel_bulkdel++;
+			if (!VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(vacoptions))
+				lps->nindexes_nonparallel_cleanup++;
+		}
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));
+	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
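The function follows the standard two-phase DSM protocol: every chunk and TOC key is counted in the estimator before InitializeParallelDSM() creates the segment, after which each chunk is allocated and registered.  A condensed, hedged sketch for a single payload (MY_PAYLOAD_KEY is a made-up key; real callers, as above, estimate all chunks before initializing):

    static void *
    setup_shared_payload(ParallelContext *pcxt, Size payload_size)
    {
        void   *payload;

        /* Phase 1: account for the chunk and its TOC key up front */
        shm_toc_estimate_chunk(&pcxt->estimator, payload_size);
        shm_toc_estimate_keys(&pcxt->estimator, 1);

        /* Phase 2: create the DSM segment, then carve out and register */
        InitializeParallelDSM(pcxt);
        payload = shm_toc_allocate(pcxt->toc, payload_size);
        shm_toc_insert(pcxt->toc, MY_PAYLOAD_KEY, payload);   /* hypothetical key */

        /* Workers retrieve it with shm_toc_lookup(toc, MY_PAYLOAD_KEY, false) */
        return payload;
    }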
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the statistics struct size of each index.  This function also counts the
+ * indexes that use maintenance_work_mem so that the maintenance_work_mem
+ * value for each worker can be computed.  Since we don't currently support
+ * parallel vacuum for autovacuum we don't need to care about
+ * autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+						 int nworkers)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+		{
+			/* Set NULL as this index does not support parallel vacuum */
+			lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+		maintenance_work_mem;
+}
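As a worked example of the final computation: with maintenance_work_mem = 256MB, nworkers = 2, and three indexes whose access methods set amusemaintenanceworkmem, each worker is given 256MB / Min(2, 3) = 128MB, so the parallel run should not consume more memory for index vacuuming than a serial one.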
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since no writes are allowed during parallel mode, and it might not be
+ * safe to exit parallel mode while keeping the parallel context around,
+ * we copy the updated index statistics into local memory first and use
+ * that copy to update the index statistics later.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check whether the given index participates in parallel index vacuuming
+ * or parallel index cleanup.  for_cleanup indicates whether we are doing
+ * index cleanup or index bulk-deletion.  first_time is true if
+ * bulk-deletion has not been performed yet.  Returns true if the index
+ * should be skipped.
+ */
+static bool
+skip_parallel_index_vacuum(Relation indrel, bool for_cleanup,
+						   bool first_time)
+{
+	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(for_cleanup || !first_time);
+
+	if (for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (!VACUUM_OPTION_SUPPORT_PARALLEL_CLEANUP(vacoptions))
+			return true;
+
+		/*
+		 * Skip if the index supports parallel cleanup only for the
+		 * first-time cleanup but this is not the first time.
+		 */
+		if (!first_time &&
+			(vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			return true;
+	}
+	else if (!VACUUM_OPTION_SUPPORT_PARALLEL_BULKDEL(vacoptions))
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
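To illustrate with a hypothetical access method: an AM advertising VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP participates in every parallel bulk-deletion pass, but joins a parallel cleanup pass only when no bulk-deletion preceded it (first_time); once bulk-deletion has run, its cleanup is skipped here and left to be processed outside the parallel pass.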
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match
+	 * the leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Increment the active worker count. */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Do either index vacuuming or index cleanup */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..b42cbd58d8 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..4b7f480fd6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1768,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table entirely, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = 0;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1985,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (!VacuumCostActive || InterruptPending)
+		return;
+
+	/*
+	 * If the vacuum cost balance is shared among parallel workers, we
+	 * decide whether to sleep based on the shared balance.
+	 */
+	if (VacuumSharedCostBalance != NULL)
 	{
-		double		msec;
+		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+		/* At least count itself */
+		Assert(nworkers >= 1);
+
+		/* Update the shared cost balance value atomically */
+		while (true)
+		{
+			uint32 shared_balance;
+			uint32 new_balance;
+			uint32 local_balance;
+
+			msec = 0;
+
+			/* compute new balance by adding the local value */
+			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+			new_balance = shared_balance + VacuumCostBalance;
 
+			/* also compute the total local balance */
+			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+			if ((new_balance >= VacuumCostLimit) &&
+				(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+			{
+				/* compute sleep time based on the local cost balance */
+				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+				new_balance = shared_balance - VacuumCostBalanceLocal;
+				VacuumCostBalanceLocal = 0;
+			}
+
+			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+											   &shared_balance,
+											   new_balance))
+			{
+				/* Updated successfully, break */
+				break;
+			}
+		}
+
+		VacuumCostBalanceLocal += VacuumCostBalance;
+
+		/*
+		 * Reset the local balance as we accumulated it into the shared
+		 * value.
+		 */
+		VacuumCostBalance = 0;
+	}
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
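A worked example of this accounting (illustrative values): with VacuumCostLimit = 200, VacuumCostDelay = 2ms, and two active workers, a worker whose local balance is around 60 (above the 0.5 * 200 / 2 = 50 threshold) when the shared balance crosses 200 sleeps for about 2 * 60 / 200 = 0.6ms and subtracts its local contribution from the shared balance, while a worker that has accumulated only 20 keeps running; workers doing more I/O are therefore throttled more.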
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 98c917bf7a..ce35be710f 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3560,7 +3560,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..61725e749f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -190,9 +192,13 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b6f269785..295b6a17f0 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -212,6 +212,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) disables
+	 * parallel vacuum, 0 chooses the degree based on the number of
+	 * indexes, and a positive value requests that many workers.
+	 */
+	int			nworkers;
 } VacuumParams;
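(Thus a plain VACUUM leaves nworkers at -1, VACUUM (PARALLEL) sets it to 0, and VACUUM (PARALLEL N) sets it to N, matching the option parsing added to vacuum.c above.)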
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..cf5e1f0a4e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0aecf17773 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

Attachment: v33-0002-delta-amit.patch (application/octet-stream)
From 87d5f6bff7817367462a94a81128bfe243516310 Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Mon, 25 Nov 2019 16:26:17 +0530
Subject: [PATCH] Added/Changed comments and other cosmetic changes.

---
 doc/src/sgml/config.sgml             |   2 +-
 doc/src/sgml/ref/vacuum.sgml         |  26 +--
 src/backend/access/heap/vacuumlazy.c | 322 +++++++++++++++++------------------
 3 files changed, 175 insertions(+), 175 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 51e7bb4a62..7e17d98fd8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2314,7 +2314,7 @@ include_dir 'conf.d'
          utility commands that support the use of parallel workers are
          <command>CREATE INDEX</command> only when building a B-tree index,
          and <command>VACUUM</command> without <literal>FULL</literal>
-         option. Parallel workers are taken from the pool of processes
+         option.  Parallel workers are taken from the pool of processes
          established by <xref linkend="guc-max-worker-processes"/>, limited
          by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ae086b976b..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -231,21 +231,21 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
       Perform vacuum index and cleanup index phases of <command>VACUUM</command>
       in parallel using <replaceable class="parameter">integer</replaceable>
       background workers (for the detail of each vacuum phases, please
-      refer to <xref linkend="vacuum-phases"/>). If the parallel degree
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
       <replaceable class="parameter">integer</replaceable> is omitted,
       then <command>VACUUM</command> decides the number of workers based
       on number of indexes that support parallel vacuum operation on the
       relation which is further limited by
-      <xref linkend="guc-max-parallel-workers-maintenance"/>. Please note
-      that it is not guaranteed that the number of parallel worker specified
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
       in <replaceable class="parameter">integer</replaceable> will be used
-      during execution. It is possible for a vacuum to run with fewer workers
-      than specified, or even with no workers at all. Only one worker can
-      be used per index. So parallel workers are launched only when there
-      are at least <literal>2</literal> indexes in the table. Workers for
-      vacuum launches before starting each phases and exit at the end of
-      the phase. These behaviors might change in a future release. This
-      option can not use with <literal>FULL</literal> option.
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
      </para>
     </listitem>
    </varlistentry>
@@ -270,8 +270,8 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
      <para>
       Specifies a positive integer value passed to the selected option.
       The <replaceable class="parameter">integer</replaceable> value can
-      also be omitted, in which case the default value of the selected
-      option is used.
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
      </para>
     </listitem>
    </varlistentry>
@@ -356,7 +356,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
-     The <option>PARALLEL</option> option is used for only vacuum purpose.
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
      Even if this option is specified with <option>ANALYZE</option> option
      it does not affect <option>ANALYZE</option>.
    </para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c2fe56a4b2..17598a126a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -23,20 +23,21 @@
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
  * Lazy vacuum supports parallel execution with parallel worker processes.  In
- * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
- * parallel worker processes.  Individual indexes are processed by one vacuum
- * process.  At the beginning of lazy vacuum (at lazy_scan_heap) we prepare the
- * parallel context and initialize the DSM segment that contains shared information
- * as well as the memory space for storing dead tuples.  When starting either
- * index vacuuming or index cleanup, we launch parallel worker processes.  Once
- * all indexes are processed the parallel worker processes exit.  And then the
- * leader process re-initializes the parallel context while keeping recorded
- * dead tuples so that the leader can launch parallel workers again in the next
- * time.  Note that all parallel workers live during either index vacuuming or
- * index cleanup but the leader process neither exits from the parallel mode
- * nor destroys the parallel context.  For updating the index statistics, since
- * any updates are not allowed during parallel mode we update the index
- * statistics after exited from the parallel mode.
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  And then the leader process re-initializes the parallel
+ * context while keeping recorded dead tuples so that the leader can launch
+ * parallel workers again in the next time.  Note that all parallel workers
+ * live during either index vacuuming or index cleanup but the leader process
+ * neither exits from the parallel mode nor destroys the parallel context.  For
+ * updating the index statistics, since any updates are not allowed during
+ * parallel mode we update the index statistics after exited from the parallel
+ * mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -130,7 +131,7 @@
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
 /*
- * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
  * since we don't need to worry about DSM keys conflicting with plan_node_id
  * we can use small integers.
  */
@@ -146,15 +147,15 @@
  */
 
 /*
- * Macro to check if we are in a parallel lazy vacuum. If true, we are
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
  * in the parallel mode and prepared the DSM segment.
  */
 #define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
 
 /*
- * LVDeadTuples stores the dead tuple TIDs collected during heap scan.
- * This is allocated in the DSM segment when parallel lazy vacuum
- * mode, otherwise allocated in a local memory.
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
  */
 typedef struct LVDeadTuples
 {
@@ -164,11 +165,12 @@ typedef struct LVDeadTuples
 	/* NB: this list is ordered by TID address */
 	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
 } LVDeadTuples;
+
 #define SizeOfLVDeadTuples offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData)
 
 /*
- * Shared information among parallel workers. So this is allocated in
- * the DSM segment.
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
  */
 typedef struct LVShared
 {
@@ -181,7 +183,7 @@ typedef struct LVShared
 
 	/*
 	 * An indication for vacuum workers of doing either index vacuuming or
-	 * index cleanup. first_time is true only if for_cleanup is true and
+	 * index cleanup.  first_time is true only if for_cleanup is true and
 	 * bulk-deletion is not performed yet.
 	 */
 	bool	for_cleanup;
@@ -201,11 +203,11 @@ typedef struct LVShared
 
 	/*
 	 * In single process lazy vacuum we could consume more memory during
-	 * index vacuuming or cleanup apart from the memory for heap scanning
-	 * if an index consume memory during ambulkdelete and amvacuumcleanup.
-	 * In parallel index vacuuming, since individual vacuum workers
-	 * consumes memory we set the new maitenance_work_mem for each workers
-	 * to not consume more memory than single process lazy vacuum.
+	 * index vacuuming or cleanup apart from the memory for heap scanning.
+	 * In parallel index vacuuming, since individual vacuum workers can
+	 * consume memory equal to maintenance_work_mem, the new
+	 * maintenance_work_mem for each worker is set such that the parallel
+	 * operation doesn't consume more memory than single process lazy vacuum.
 	 */
 	int		maintenance_work_mem_worker;
 
@@ -237,51 +239,31 @@ typedef struct LVShared
 
 	/* Shared index statistics data follows at end of struct */
 } LVShared;
+
 #define SizeOfLVShared offsetof(LVShared, bitmap) + sizeof(bits8)
 #define GetSharedIndStats(s) \
 	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
 #define IndStatsIsNull(s, i) \
 	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
 
-/*
- * Variables for cost-based vacuum delay for parallel index vacuuming.
- * The basic idea of cost-based vacuum delay for parallel index vacuuming
- * is to allow all parallel vacuum workers including the leader process
- * to have a shared view of cost related parameters (mainly VacuumCostBalance)
- * and allow each worker to update it and then based on that decide
- * whether it needs to sleep.  Besides, we allow any worker to sleep
- * only if it has performed the I/O above a certain threshold, which is
- * calculated based on the number of active workers (VacuumActiveNWorkers),
- * and the overall cost balance is more than VacuumCostLimit set by the
- * system.  Then we will allow the worker to sleep proportional to the work
- * done and reduce the VacuumSharedCostBalance by the amount which is
- * consumed by the current worker (VacuumCostBalanceLocal).  This can
- * avoid letting the workers sleep which has done less or no I/O as compared
- * to other workers, and therefore can ensure that workers who are doing
- * more I/O got throttled more.
- */
-pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
-pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
-int					VacuumCostBalanceLocal = 0;
-
 /*
  * Struct for an index bulk-deletion statistic used for parallel lazy
- * vacuum. This is allocated in the DSM segment.  IndexBulkDeleteResult
- * follows at end of struct.
+ * vacuum.  This is allocated in the DSM segment.
  */
 typedef struct LVSharedIndStats
 {
 	Size	size;
-	bool	updated;	/* are the stats updated */
+	bool	updated;	/* are the stats updated? */
 
-	/* Index bulk-deletion result data follows at end of struct */
+	/* IndexBulkDeleteResult data follows at end of struct */
 } LVSharedIndStats;
+
 #define SizeOfSharedIndStats(s) \
 	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
 #define GetIndexBulkDeleteResult(s) \
 	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
 
-/* Struct for parallel lazy vacuum */
+/* Struct for maintaining a parallel vacuum state. */
 typedef struct LVParallelState
 {
 	ParallelContext	*pcxt;
@@ -337,6 +319,26 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process
+ * to have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep
+ * only if it has performed the I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than VacuumCostLimit set by the
+ * system.  Then we will allow the worker to sleep proportional to the work
+ * done and reduce the VacuumSharedCostBalance by the amount which is
+ * consumed by the current worker (VacuumCostBalanceLocal).  This can
+ * avoid letting the workers sleep which has done less or no I/O as compared
+ * to other workers, and therefore can ensure that workers who are doing
+ * more I/O got throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
@@ -363,19 +365,6 @@ static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
-											  BlockNumber nblocks, Relation *Irel,
-											  int nindexes, int nrequested);
-static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
-								IndexBulkDeleteResult **stats);
-static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
-									 int nworkers);
-static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
-								int nindexes, IndexBulkDeleteResult **stats,
-								LVParallelState *lps);
-static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
-								 int nindexes, IndexBulkDeleteResult **stats,
-								 LVParallelState *lps);
 static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 													int nindexes, IndexBulkDeleteResult **stats,
 													LVParallelState *lps);
@@ -383,11 +372,24 @@ static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 											 IndexBulkDeleteResult **stats,
 											 LVShared *lvshared,
 											 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
 static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
 									int nindexes);
-static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
-static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
 static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
 static bool skip_parallel_index_vacuum(Relation indrel, bool for_cleanup,
 									   bool first_time);
 
@@ -705,12 +707,12 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *
  *		If the table has at least two indexes and parallel lazy vacuum is
  *		requested, we execute both index vacuuming and index cleanup with
- *		parallel workers. In parallel lazy vacuum, we enter parallel mode and
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
  *		then create both the parallel context and the DSM segment before starting
- *		heap scan so that we can record dead tuples to the DSM segment. All
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
  *		parallel workers are launched at beginning of index vacuuming and index
- *		cleanup and they exit once done with all indexes. At the end of this
- *		function we exit from parallel mode. Index bulk-deletion results are
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
  *		stored in the DSM segment and update index statistics as a whole after
  *		exited from parallel mode since all writes are not allowed during parallel
  *		mode.
@@ -784,8 +786,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
 	/*
-	 * If parallel lazy vacuum is requested and we vacuum indexes, compute
-	 * the number of parallel vacuum worker to launch.
+	 * Compute the number of parallel vacuum workers to launch if the parallel
+	 * vacuum is requested and we need to vacuum the indexes.
 	 */
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		parallel_workers = compute_parallel_workers(Irel, nindexes,
@@ -1992,9 +1994,10 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 }
 
 /*
- * Perform index vacuuming or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process. The caller must set
- * lps->lvshared->for_cleanup to indicate whether vacuuming or cleanup.
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process.  The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
  */
 static void
 lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
@@ -2042,7 +2045,7 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 						lps->pcxt->nworkers_launched, lps->pcxt->nworkers)));
 
 	/*
-	* Increment the active worker count. We cannot decrement until the
+	* Increment the active worker count.  We cannot decrement until the
 	* all parallel workers finish.
 	*/
 	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
@@ -2220,11 +2223,11 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 }
 
 /*
- * Cleanup indexes.  This function must be used by the parallel vacuum
- * leader process in parallel vacuum case.
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
  */
 static void
-lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 					int nindexes, IndexBulkDeleteResult **stats,
 					LVParallelState *lps)
 {
@@ -2233,25 +2236,19 @@ lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	Assert(!IsParallelWorker());
 	Assert(nindexes > 0);
 
-	/*
-	 * If parallel vacuum is active we perform index cleanup with parallel
-	 * workers.
-	 */
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
 	if (ParallelVacuumIsActive(lps))
 	{
-		/* Tell parallel workers to do index cleanup */
-		lps->lvshared->for_cleanup = true;
-		lps->lvshared->first_time =
-			(vacrelstats->num_index_scans == 0);
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
 
 		/*
-		 * Now we can provide a better estimate of total number of
-		 * surviving tuples (we assume indexes are more interested in that
-		 * than in the number of nominally live tuples).
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
 		 */
-		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
-		lps->lvshared->estimated_count =
-			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
 
 		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
 												stats, lps);
@@ -2259,20 +2256,19 @@ lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	else
 	{
 		for (idx = 0; idx < nindexes; idx++)
-			lazy_cleanup_index(Irel[idx], &stats[idx],
-							   vacrelstats->new_rel_tuples,
-							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
 	}
 }
 
 /*
- * Vacuum indexes. This function must be used by the parallel vacuum leader
- * process in parallel vacuum case.
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
  */
 static void
-lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
-					int nindexes, IndexBulkDeleteResult **stats,
-					LVParallelState *lps)
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					 int nindexes, IndexBulkDeleteResult **stats,
+					 LVParallelState *lps)
 {
 	int		idx;
 
@@ -2280,21 +2276,24 @@ lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	Assert(nindexes > 0);
 
 	/*
-	 * If parallel vacuum is active we perform index vacuuming with
-	 * parallel workers.
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
 	 */
 	if (ParallelVacuumIsActive(lps))
 	{
-		/* Tell parallel workers to do index vacuuming */
-		lps->lvshared->for_cleanup = false;
-		lps->lvshared->first_time = false;
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+						(vacrelstats->num_index_scans == 0);
 
 		/*
-		 * We can only provide an approximate value of num_heap_tuples in
-		 * vacuum cases.
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
 		 */
-		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
-		lps->lvshared->estimated_count = true;
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+					(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
 
 		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
 												stats, lps);
@@ -2302,8 +2301,9 @@ lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	else
 	{
 		for (idx = 0; idx < nindexes; idx++)
-			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
-							  vacrelstats->old_live_tuples);
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
 	}
 }
 
@@ -3022,6 +3022,53 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
 	return parallel_workers;
 }
 
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap
+ * and the struct size of each index.  This function also counts the
+ * indexes that use maintenance_work_mem, to set the per-worker memory
+ * limit.  Since currently we don't support parallel vacuum for
+ * autovacuum we don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+						 int nworkers)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+		{
+			/* Set NULL as this index does not support parallel vacuum */
+			lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+					(nindexes_mwm > 0) ?
+					maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+					maintenance_work_mem;
+}
+
 /*
  * Enter parallel mode, allocate and initialize the DSM segment.
  */
@@ -3134,53 +3181,6 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	return lps;
 }
 
-/*
- * Initialize variables for shared index statistics, set NULL bitmap and
- * the struct size of each indexes.  Also this function sets the number of
- * indexes that do not support parallel index vacuuming and that use
- * maintenance_work_mem.  Since currently we don't support parallel vacuum
- * for autovacuum we don't need to care about autovacuum_work_mem.
- */
-static void
-prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
-						 int nworkers)
-{
-	char *p = (char *) GetSharedIndStats(lvshared);
-	int nindexes_mwm = 0;
-	int i;
-
-	Assert(!IsAutoVacuumWorkerProcess());
-
-	for (i = 0; i < nindexes; i++)
-	{
-		LVSharedIndStats *indstats;
-
-		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
-			VACUUM_OPTION_NO_PARALLEL)
-		{
-			/* Set NULL as this index does not support parallel vacuum */
-			lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);
-			continue;
-		}
-
-		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
-			nindexes_mwm++;
-
-		/* Set the size for index statistics */
-		indstats = (LVSharedIndStats *) p;
-		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
-		indstats->size = index_parallelvacuum_estimate(Irel[i]);
-
-		p += SizeOfSharedIndStats(indstats);
-	}
-
-	/* Compute the new maitenance_work_mem value for index vacuuming */
-	lvshared->maintenance_work_mem_worker =
-		(nindexes_mwm > 0) ?
-		maintenance_work_mem / Min(nworkers, nindexes_mwm) :
-		maintenance_work_mem;
-}
-
 /*
  * Destroy the parallel context, and end parallel mode.
  *
-- 
2.16.2.windows.1

#231Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#229)

On Fri, 22 Nov 2019 at 10:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

I have reviewed the first patch and made a number of modifications
that include adding/modifying comments and making some corrections and
modifications to the documentation. You can find my changes in
v33-0001-delta-amit.patch. See, if those look okay to you, if so,
please include those in the next version of the patch. I am attaching
both your version of patch and delta changes by me.

Thank you.

All changes look good to me. But after the changes to the 0002 patch, the
two macros for parallel vacuum options (VACUUM_OPTIONS_SUPPORT_XXX) are no
longer necessary. So we can remove them, and we can add them back if we
need them again.

One comment on v33-0002-Add-parallel-option-to-VACUUM-command:

+ /* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+ est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN
(nindexes)));
..
+ shared->offset = add_size(SizeOfLVShared, BITMAPLEN(nindexes));

Here, don't you need to do MAXALIGN to set offset as we are computing
it that way while estimating shared memory? If not, then probably,
some comments are required to explain it.

You're right. Will fix it.
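
To be concrete, I think the fix is just to MAXALIGN the offset the same
way as the estimate so that both sides agree (a sketch against the v33
code quoted above, not the final patch):

+ est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
..
+ /* Must match the MAXALIGN'ed estimate above */
+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));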

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#232Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#231)

On Mon, Nov 25, 2019 at 9:42 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 22 Nov 2019 at 10:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the latest version patch set. The patch set includes all
discussed points regarding index AM options as well as shared cost
balance. Also I added some test cases that use all types of index AM.

I have reviewed the first patch and made a number of modifications
that include adding/modifying comments and making some corrections and
modifications to the documentation. You can find my changes in
v33-0001-delta-amit.patch. See, if those look okay to you, if so,
please include those in the next version of the patch. I am attaching
both your version of patch and delta changes by me.

Thank you.

All changes look good to me. But after the changes to the 0002 patch, the
two macros for parallel vacuum options (VACUUM_OPTIONS_SUPPORT_XXX) are no
longer necessary. So we can remove them, and we can add them back if we
need them again.

Sounds reasonable.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#233Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#230)
1 attachment(s)

On Mon, Nov 25, 2019 at 5:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

2.
lazy_parallel_vacuum_or_cleanup_indexes()
{
..
..
}

Here, it seems that we can increment/decrement the
VacuumActiveNWorkers even when there is no work performed by the
leader backend. How about moving increment/decrement inside function
vacuum_or_cleanup_indexes_worker? In that case, we need to do it in
this function when we are actually doing an index vacuum or cleanup.
After doing that the other usage of increment/decrement of
VacuumActiveNWorkers in other function heap_parallel_vacuum_main can
be removed.

One of my colleagues, Mahendra, who was testing this patch, found that
the index stats reported by the view pg_statio_all_tables are wrong for
parallel vacuum. I debugged the issue and found that there were two
problems in the stats related code.
1. The function get_indstats seems to be computing the wrong value of
stats for the last index.
2. The function lazy_parallel_vacuum_or_cleanup_indexes() was not
pointing to the computed stats when the parallel index scan is
skipped.

Find the above two fixes in the attached patch. This is on top of the
patches I sent yesterday [1]/messages/by-id/CAA4eK1LQ+YGjmSS-XqhuAa6eb=Xykpx1LiT7UXJHmEKP=0QtsA@mail.gmail.com.
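
To illustrate the first problem: get_indstats(lvshared, n) has to walk
past the (variable-length) stats of indexes 0 .. n-1 to reach the slot
for index n, so the loop must iterate n times, not n - 1. Roughly (a
simplified sketch; the pointer advance is inferred from the
SizeOfSharedIndStats macro and the NULL check for index n is omitted):

    p = (char *) GetSharedIndStats(lvshared);
    for (i = 0; i < n; i++)    /* was i < (n - 1), stopping one slot short */
    {
        if (IndStatsIsNull(lvshared, i))
            continue;
        p += SizeOfSharedIndStats(p);
    }
    return (LVSharedIndStats *) p;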

Some more comments on v33-0002-Add-parallel-option-to-VACUUM-command
-------------------------------------------------------------------------------------------------------------
1. The code in function lazy_parallel_vacuum_or_cleanup_indexes()
that processes the indexes that have skipped parallel processing can
be moved to a separate function. Further, the newly added code by the
attached patch can also be moved to a separate function as the same
code is used in function vacuum_or_cleanup_indexes_worker().

2.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
{
..
+ stats = (IndexBulkDeleteResult **)
+ palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
..
}

It would be neat if we free this memory once it is used.
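For example, just freeing it at the end of heap_parallel_vacuum_main once
all indexes are processed (sketch only):

+ /* stats array was palloc0'd locally; release it when done */
+ pfree(stats);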

3.
+ /*
+ * Compute the number of indexes that can participate to parallel index
+ * vacuuming.
+ */

/to/in

4. The function lazy_parallel_vacuum_or_cleanup_indexes() launches
workers without checking whether it needs to do the same or not. For
ex. in cleanup phase, it is possible that we don't need to launch any
worker, so it would be wasteful. It might be that you are already
planning to handle it based on the previous comments/discussion in
which case you can ignore this.

[1]: /messages/by-id/CAA4eK1LQ+YGjmSS-XqhuAa6eb=Xykpx1LiT7UXJHmEKP=0QtsA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v33-0002-delta2-fix-stats-issue.patch (application/octet-stream)
From a6151eae698c238ee4d978d161e1f826c0956fa9 Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Tue, 26 Nov 2019 17:02:29 +0530
Subject: [PATCH] fix stats issue.

---
 src/backend/access/heap/vacuumlazy.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 17598a126a..11a167afd0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2073,6 +2073,9 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 
 		for (i = 0; i < nindexes; i++)
 		{
+			LVSharedIndStats *shared_indstats;
+			IndexBulkDeleteResult *bulkdelete_res;
+
 			bool processed = !skip_parallel_index_vacuum(Irel[i],
 														 lps->lvshared->for_cleanup,
 														 lps->lvshared->first_time);
@@ -2081,6 +2084,24 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 			if (processed)
 				continue;
 
+			/* Get index statistics struct of this index */
+			shared_indstats = get_indstats(lps->lvshared, i);
+
+			/* shared_indstats is NULL if the index doesn't support parallel vacuum */
+			if (shared_indstats != NULL)
+			{
+
+				/* Get the space for IndexBulkDeleteResult */
+				bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+				/*
+				 * Update the pointer to the corresponding bulk-deletion result
+				 * if someone has already updated it.
+				 */
+				if (shared_indstats->updated && stats[i] == NULL)
+					stats[i] = bulkdelete_res;
+			}
+
 			if (lps->lvshared->for_cleanup)
 				lazy_cleanup_index(Irel[i], &stats[i],
 								   vacrelstats->new_rel_tuples,
@@ -3239,7 +3260,7 @@ get_indstats(LVShared *lvshared, int n)
 		return NULL;
 
 	p = (char *) GetSharedIndStats(lvshared);
-	for (i = 0; i < (n - 1); i++)
+	for (i = 0; i < n; i++)
 	{
 		if (IndStatsIsNull(lvshared, i))
 			continue;
-- 
2.16.2.windows.1

#234Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#233)
1 attachment(s)

On Tue, 26 Nov 2019 at 13:34, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 25, 2019 at 5:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

2.
lazy_parallel_vacuum_or_cleanup_indexes()
{
..
..
}

Here, it seems that we can increment/decrement the
VacuumActiveNWorkers even when there is no work performed by the
leader backend. How about moving increment/decrement inside function
vacuum_or_cleanup_indexes_worker? In that case, we need to do it in
this function when we are actually doing an index vacuum or cleanup.
After doing that the other usage of increment/decrement of
VacuumActiveNWorkers in other function heap_parallel_vacuum_main can
be removed.

Yeah we can move it inside vacuum_or_cleanup_indexes_worker but we
still need to increment the count before processing the indexes that
were skipped by the parallel operation, because some workers might still
be running.
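
In other words, the leader still brackets that serial part with the
shared counter updates, roughly like this (trimmed from the patch):

    if (in_parallel)
        pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);

    /* ... vacuum or clean up the skipped indexes one by one ... */

    if (in_parallel)
        pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);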

One of my colleagues, Mahendra, who was testing this patch, found that
the index stats reported by the view pg_statio_all_tables are wrong for
parallel vacuum. I debugged the issue and found that there were two
problems in the stats related code.
1. The function get_indstats seems to be computing the wrong value of
stats for the last index.
2. The function lazy_parallel_vacuum_or_cleanup_indexes() was not
pointing to the computed stats when the parallel index scan is
skipped.

Find the above two fixes in the attached patch. This is on top of the
patches I sent yesterday [1].

Thank you! While testing the current patch myself I also found this bug.

Some more comments on v33-0002-Add-parallel-option-to-VACUUM-command
-------------------------------------------------------------------------------------------------------------
1. The code in function lazy_parallel_vacuum_or_cleanup_indexes()
that processes the indexes that have skipped parallel processing can
be moved to a separate function. Further, the newly added code by the
attached patch can also be moved to a separate function as the same
code is used in function vacuum_or_cleanup_indexes_worker().

2.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
{
..
+ stats = (IndexBulkDeleteResult **)
+ palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
..
}

It would be neat if we free this memory once it is used.

3.
+ /*
+ * Compute the number of indexes that can participate to parallel index
+ * vacuuming.
+ */

/to/in

4. The function lazy_parallel_vacuum_or_cleanup_indexes() launches
workers without checking whether it needs to do the same or not. For
ex. in cleanup phase, it is possible that we don't need to launch any
worker, so it would be wasteful. It might be that you are already
planning to handle it based on the previous comments/discussion in
which case you can ignore this.

I've incorporated the comments I got so far including the above and
the memory alignment issue. Therefore the attached v34 patch includes
those changes as well as the changes in v33-0002-delta-amit.patch and
v33-0002-delta2-fix-stats-issue.patch. In this version I added an extra
argument to the LaunchParallelWorkers function so that the leader process
launches only as many parallel workers as the particular phase needs.
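
So the per-phase call site now looks roughly like this (trimmed from the
attached patch; nworkers is computed separately for each phase):

    /* Cap by the number of workers computed at the start of parallel vacuum */
    nworkers = Min(nworkers, lps->pcxt->nworkers);

    if (nworkers > 0)
        LaunchParallelWorkers(lps->pcxt, nworkers);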

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v34-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 4738f0801cb8854894ea8299f4763e2ffc9b63da Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v34] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables us
to perform index vacuuming and index cleanup with background workers.
Each index is processed by one vacuum process.  Therefore parallel vacuum
can be used when the table has at least two indexes, and the specified
parallel degree cannot be larger than the number of indexes on the table.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml               |   14 +-
 doc/src/sgml/ref/vacuum.sgml           |   45 +
 src/backend/access/heap/vacuumlazy.c   | 1283 ++++++++++++++++++++++--
 src/backend/access/nbtree/nbtsort.c    |    2 +-
 src/backend/access/transam/parallel.c  |    9 +-
 src/backend/commands/vacuum.c          |  109 +-
 src/backend/executor/nodeGather.c      |    2 +-
 src/backend/executor/nodeGatherMerge.c |    2 +-
 src/backend/postmaster/autovacuum.c    |    2 +
 src/bin/psql/tab-complete.c            |    2 +-
 src/include/access/heapam.h            |    6 +
 src/include/access/parallel.h          |    2 +-
 src/include/commands/vacuum.h          |    5 +
 src/test/regress/expected/vacuum.out   |   26 +
 src/test/regress/sql/vacuum.sql        |   25 +
 15 files changed, 1416 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d4d1fe45cc..7e17d98fd8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2310,13 +2310,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on the
+      relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum launches before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..33e78a6cca 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,21 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  The leader process then re-initializes the parallel
+ * context so that it can launch parallel workers again the next time.
+ * Note that the parallel workers live only during either index vacuuming
+ * or index cleanup, but the leader process neither exits from parallel
+ * mode nor destroys the parallel context.  As for the index statistics,
+ * since updates are not allowed during parallel mode, we update the index
+ * statistics after exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +51,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +73,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,162 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers, so this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * Tells the vacuum workers whether to do index vacuuming or index
+	 * cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool	for_cleanup;
+	bool	first_time;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old live tuples in the index vacuuming case or the new live
+	 * tuples in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning.
+	 * In parallel index vacuuming, since individual vacuum workers can
+	 * consume memory equal to maintenance_work_mem, the new
+	 * maintenance_work_mem for each worker is set such that the parallel
+	 * operation doesn't consume more memory than single process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of all parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance at which a worker
+	 * sleeps for the cost-based delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  The index statistics
+	 * returned from ambulkdelete and amvacuumcleanup are nullable and of
+	 * variable length.  'bitmap' is the NULL bitmap: a 0 indicates a null,
+	 * while a 1 indicates non-null.  The index statistics follow at the
+	 * end of the struct.
+	 */
+	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated? */
+
+	/* IndexBulkDeleteResult data follows at end of struct */
+} LVSharedIndStats;
+
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion,
+	 * unconditional index cleanup, and conditional index cleanup, respectively.
+	 */
+	int				nindexes_parallel_bulkdel;
+	int				nindexes_parallel_cleanup;
+	int				nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION are defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +303,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -148,6 +319,26 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process
+ * to have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep
+ * only if it has performed the I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than VacuumCostLimit set by the
+ * system.  Then we will allow the worker to sleep proportional to the work
+ * done and reduce the VacuumSharedCostBalance by the amount which is
+ * consumed by the current worker (VacuumCostBalanceLocal).  This can
+ * avoid letting the workers sleep which has done less or no I/O as compared
+ * to other workers, and therefore can ensure that workers who are doing
+ * more I/O got throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
@@ -155,12 +346,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +359,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+											  int nindexes, IndexBulkDeleteResult **stats,
+											  LVParallelState *lps, bool in_parallel);
+static void vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+											   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+											   LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +710,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		the heap scan so that we can record dead tuples in the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes.  At the end of
+ *		this function we exit from parallel mode.  Index bulk-deletion results
+ *		are stored in the DSM segment and we update the index statistics as a
+ *		whole after exiting from parallel mode, since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +730,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +754,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +790,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to launch if the parallel
+	 * vacuum is requested and we need to vacuum the indexes.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +1002,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +1031,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1051,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1247,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1286,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1432,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1502,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1531,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1646,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1680,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1696,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1722,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1800,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1809,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1857,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1868,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1998,392 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	int	nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * Reset the local value so that we compute cost balance during
+		 * parallel index vacuuming.
+		 */
+		VacuumCostBalance = 0;
+		VacuumCostBalanceLocal = 0;
+
+		LaunchParallelWorkers(lps->pcxt, nworkers);
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/*
+	 * Join as a parallel worker.  The leader process alone does the work
+	 * in case no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * At this point only the indexes that were skipped during parallel index
+	 * vacuuming remain.  If there are such indexes the leader process vacuums
+	 * or cleans them up one by one.
+	 */
+	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes,
+									  stats, lps, nworkers > 0);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value over to the heap scan */
+	VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+	VacuumActiveNWorkers = NULL;
+
+	/*
+	 * In the cleanup case we don't need to reinitialize the parallel
+	 * context, as no more index vacuuming or index cleanup will be
+	 * performed afterwards.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing counts */
+		pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes including the leader process.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/* Increment the active worker count */
+	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Skip if this index doesn't support parallel execution
+		 * at this time.
+		 */
+		if (skip_parallel_index_vacuum(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Get index statistics struct of this index */
+		shared_indstats = get_indstats(lvshared, idx);
+		Assert(shared_indstats);
+
+		/* Do vacuum or cleanup one index */
+		vacuum_or_cleanup_one_index_worker(Irel[idx], &(stats[idx]),
+										   lvshared, shared_indstats,
+										   dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up indexes that were skipped during the parallel operation
+ * because they don't support parallel processing in that phase.  This
+ * function must be called by the leader process.  in_parallel is true when
+ * some parallel workers might still be running, in which case we need to
+ * increment the active worker count for shared cost-based vacuum delay.
+ */
+static void
+vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								  int nindexes, IndexBulkDeleteResult **stats,
+								  LVParallelState *lps, bool in_parallel)
+{
+	int nindexes_remains;
+	int i;
+#ifdef USE_ASSERT_CHECKING
+	int nprocessed = 0;
+#endif
+
+	nindexes_remains = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	Assert(nindexes_remains >= 0);
+
+	/* Quick exit if all indexes have already been processed */
+	if (nindexes_remains == 0)
+		return;
+
+	/* Increment the active worker count if some worker might be running */
+	if (in_parallel)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool processed = !skip_parallel_index_vacuum(Irel[i], lps->lvshared);
+
+		/* Skip the already processed indexes */
+		if (processed)
+			continue;
+
+		vacuum_or_cleanup_one_index_worker(Irel[i], &(stats[i]),
+										   lps->lvshared, get_indstats(lps->lvshared, i),
+										   vacrelstats->dead_tuples);
+
+#ifdef USE_ASSERT_CHECKING
+		nprocessed++;
+#endif
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (in_parallel)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+#ifdef USE_ASSERT_CHECKING
+	Assert(nprocessed == nindexes_remains);
+#endif
+}
+
+/*
+ * Vacuum or clean up one index; used by both parallel vacuum workers and
+ * the leader process.  After processing an index, this function copies the
+ * index statistics returned from ambulkdelete and amvacuumcleanup to the
+ * DSM segment.
+ */
+static void
+vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+								   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+								   LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup one index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete
+	 * and amvacuumcleanup to the DSM segment the first time we get it
+	 * from them, because they allocate it locally and it's possible
+	 * that the index will be vacuumed by a different vacuum process
+	 * the next time.  Copying the result normally happens only after
+	 * the first index vacuuming.  From the second time on, we pass
+	 * the result stored in the DSM segment so that ambulkdelete and
+	 * amvacuumcleanup update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at
+	 * different slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, shared_indstats->size);
+		shared_indstats->updated = true;
+
+		/*
+		 * no longer need the locally allocated result and now
+		 * stats[idx] points to the DSM segment.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					 int nindexes, IndexBulkDeleteResult **stats,
+					 LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+						(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+					(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2393,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is
+ *		true if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2432,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
 
-	pfree(stats);
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2795,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2819,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2875,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +3028,412 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The relation sizes of the table and indexes don't affect the parallel
+ * degree for now.  nrequested is the number of parallel workers that the
+ * user requested.  If nrequested is 0, we compute the parallel degree
+ * based on the number of indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_parallel = 0;
+	int		nindexes_parallel_bulkdel = 0;
+	int		nindexes_parallel_cleanup = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, set NULL bitmap and
+ * the struct size of each indexes.  Also this function sets the number of
+ * indexes that do not support parallel index vacuuming and that use
+ * maintenance_work_mem.  Since currently we don't support parallel vacuum
+ * for autovacuum we don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+	int nworkers)
+{
+	char *p = (char *)GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+		{
+			/* Set NULL as this index does not support parallel vacuum */
+			lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);
+			continue;
+		}
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *)p;
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+					(nindexes_mwm > 0) ?
+					maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+					maintenance_work_mem;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
+		{
+			est_shared = add_size(est_shared,
+								  add_size(sizeof(LVSharedIndStats),
+										   index_parallelvacuum_estimate(Irel[i])));
+
+			/*
+			 * Remember the number of indexes that support parallel operation
+			 * for each phase.
+			 */
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+				lps->nindexes_parallel_bulkdel++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+				lps->nindexes_parallel_cleanup++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+				lps->nindexes_parallel_condcleanup++;
+		}
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+
+	/*
+	 * The offset must be MAXALIGN'd to match the way we estimated the
+	 * shared memory size above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since no writes are allowed during parallel mode and it might not be
+ * safe to exit parallel mode while keeping the parallel context, we copy
+ * the updated index statistics into local memory and later use that copy
+ * to update the index statistics in pg_class.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip indexes that have no slot in the DSM segment; their
+		 * statistics are already stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check if the given index participates parallel index vacuuming
+ * or parallel index cleanup.
+ */
+static bool
+skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
+{
+	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip if the index support to parallel cleanup only first
+		 * time cleanup but it is not the first time.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming or index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as that of the leader
+	 * process.  This is okay because the lock mode does not conflict
+	 * among the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in order by OID, which should
+	 * match the order the leader sees.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index fc7d43a0f3..a8ae866a9c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1428,7 +1428,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_QUERY_TEXT, sharedquery);
 
 	/* Launch workers, saving status for leader/caller */
-	LaunchParallelWorkers(pcxt);
+	LaunchParallelWorkers(pcxt, request);
 	btleader->pcxt = pcxt;
 	btleader->nparticipanttuplesorts = pcxt->nworkers_launched;
 	if (leaderparticipates)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..157c309211 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
@@ -490,10 +494,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
  * Launch parallel workers.
  */
 void
-LaunchParallelWorkers(ParallelContext *pcxt)
+LaunchParallelWorkers(ParallelContext *pcxt, int nworkers)
 {
 	MemoryContext oldcontext;
 	BackgroundWorker worker;
+	int			nworkers_to_launch = Min(nworkers, pcxt->nworkers);
 	int			i;
 	bool		any_registrations_failed = false;
 
@@ -533,7 +538,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..4b7f480fd6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1768,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is a
+	 * better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = 0;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1985,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (!VacuumCostActive || InterruptPending)
+		return;
+
+	/*
+	 * If the vacuum cost balance is shared among parallel workers we
+	 * decide whether to sleep based on that.
+	 */
+	if (VacuumSharedCostBalance != NULL)
 	{
-		double		msec;
+		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+		/* At least this process itself should be counted */
+		Assert(nworkers >= 1);
+
+		/* Update the shared cost balance value atomically */
+		while (true)
+		{
+			uint32 shared_balance;
+			uint32 new_balance;
+			uint32 local_balance;
+
+			msec = 0;
+
+			/* compute new balance by adding the local value */
+			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+			new_balance = shared_balance + VacuumCostBalance;
 
+			/* also compute the total local balance */
+			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+			if ((new_balance >= VacuumCostLimit) &&
+				(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+			{
+				/* compute sleep time based on the local cost balance */
+				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+				new_balance = shared_balance - VacuumCostBalanceLocal;
+				VacuumCostBalanceLocal = 0;
+			}
+
+			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+											   &shared_balance,
+											   new_balance))
+			{
+				/* Updated successfully, break */
+				break;
+			}
+		}
+
+		VacuumCostBalanceLocal += VacuumCostBalance;
+
+		/*
+		 * Reset the local balance as we accumulated it into the shared
+		 * value.
+		 */
+		VacuumCostBalance = 0;
+	}
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 69d5a1f239..df28ff2927 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -183,7 +183,7 @@ ExecGather(PlanState *pstate)
 			 * requested, or indeed any at all.
 			 */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gather->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 6ef128e2ab..cb9d5a725a 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -224,7 +224,7 @@ ExecGatherMerge(PlanState *pstate)
 
 			/* Try to launch workers. */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gm->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 172e00b46e..eb9c2e8d0b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3585,7 +3585,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..61725e749f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -190,9 +192,13 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..e5e6ae6c08 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -63,7 +63,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
-extern void LaunchParallelWorkers(ParallelContext *pcxt);
+extern void LaunchParallelWorkers(ParallelContext *pcxt, int nworkers);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
 extern void DestroyParallelContext(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5f23f1ab1d..0a586dca8d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -218,6 +218,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * workers, and 0 means the degree is chosen based on the number of
+	 * indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..cf5e1f0a4e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0aecf17773 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

#235Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#234)

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version. BTW, why haven't you posted
0001 patch (IndexAM API's patch)? I think without that we need to use
the previous version for that. Also, I think we should post Dilip's
patch related to Gist index [1]/messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com modifications for parallel vacuum or
at least mention it when posting a new version, as without it even
make check fails.

[1]: /messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#236Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#235)
1 attachment(s)

On Wed, Nov 27, 2019 at 8:14 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version.

Few comments:
-----------------------
1.
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+ IndexBulkDeleteResult **stats,
+ LVShared *lvshared,
+ LVDeadTuples *dead_tuples)
+{
+ /* Increment the active worker count */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);

The above code is wrong because it is possible that this function is
called even when there are no workers in which case
VacuumActiveNWorkers will be NULL.

2.
+ /* Take over the shared balance value to heap scan */
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);

We can carry over shared balance only if the same is active.

3.
+ if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+ VACUUM_OPTION_NO_PARALLEL)
+ {
+ /* Set NULL as this index does not support parallel vacuum */
+ lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);

Can we avoid setting this for each index by initializing bitmap as all
NULL's as is done in the attached patch?

4.
+ /*
+ * Variables to control parallel index vacuuming.  Index statistics
+ * returned from ambulkdelete and amvacuumcleanup is nullable
variable
+ * length.  'offset' is NULL bitmap. Note that a 0 indicates a null,
+ * while 1 indicates non-null.  The index statistics follows
at end of
+ * struct.
+ */

This comment is not clear, so I have re-worded it. See if the
changed comment makes sense.

I have fixed all the above issues, made a couple of other cosmetic
changes and modified a few comments. See the changes in
v34-0002-delta-amit. I am attaching just the delta patch on top of
v34-0002-Add-parallel-option-to-VACUUM-command.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v34-0002-delta-amit.patch
From a4512c383e4758cfdc8a34a8abff94b22f4e55fa Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Wed, 27 Nov 2019 17:46:35 +0530
Subject: [PATCH] fixed issues, rearranged some code and made cosmetic changes.

---
 src/backend/access/heap/vacuumlazy.c | 124 ++++++++++++++++++++---------------
 1 file changed, 70 insertions(+), 54 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 33e78a6cca..b71e6a2fc6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -31,12 +31,13 @@
  * When starting either index vacuuming or index cleanup, we launch parallel
  * worker processes.  Once all indexes are processed the parallel worker
  * processes exit.  And then the leader process re-initializes the parallel
- * context so that the leader can launch parallel workers again in the next
- * time.  Note that all parallel workers live during either index vacuuming
- * or index cleanup but the leader process neither exits from the parallel
- * mode nor destroys the parallel context.  For updating the index statistics,
- * since any updates are not allowed during parallel mode we update the index
- * statistics after exited from the parallel mode.
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Note that all parallel workers
+ * live during either index vacuuming or index cleanup but the leader process
+ * neither exits from the parallel mode nor destroys the parallel context.
+ * For updating the index statistics, since any updates are not allowed during
+ * parallel mode we update the index statistics after exited from the parallel
+ * mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -181,7 +182,7 @@ typedef struct LVShared
 	int		elevel;
 
 	/*
-	 * An indication for vacuum workers of doing either index vacuuming or
+	 * An indication for vacuum workers to perform either index vacuuming or
 	 * index cleanup.  first_time is true only if for_cleanup is true and
 	 * bulk-deletion is not performed yet.
 	 */
@@ -225,11 +226,9 @@ typedef struct LVShared
 	pg_atomic_uint32 active_nworkers;
 
 	/*
-	 * Variables to control parallel index vacuuming.  Index statistics
-	 * returned from ambulkdelete and amvacuumcleanup is nullable variable
-	 * length.  'offset' is NULL bitmap. Note that a 0 indicates a null,
-	 * while 1 indicates non-null.  The index statistics follows at end of
-	 * struct.
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
 	 */
 	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
 	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
@@ -374,7 +373,7 @@ static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 											 LVDeadTuples *dead_tuples);
 static void vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 											  int nindexes, IndexBulkDeleteResult **stats,
-											  LVParallelState *lps, bool in_parallel);
+											  LVParallelState *lps);
 static void vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
 											   LVShared *lvshared, LVSharedIndStats *shared_indstats,
 											   LVDeadTuples *dead_tuples);
@@ -2034,17 +2033,6 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	/* Setup the shared cost-based vacuum delay and launch workers*/
 	if (nworkers > 0)
 	{
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
-
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
-
 		/*
 		 * Reset the local value so that we compute cost balance during
 		 * parallel index vacuuming.
@@ -2054,6 +2042,21 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 
 		LaunchParallelWorkers(lps->pcxt, nworkers);
 
+		/* Enable shared costing iff we process indexes in parallel. */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
+
 		if (lps->lvshared->for_cleanup)
 			ereport(elevel,
 					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
@@ -2081,14 +2084,15 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	 * are remaining. If there are such indexes the leader process does vacuum
 	 * or cleanup them one by one.
 	 */
-	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes,
-									  stats, lps, nworkers > 0);
+	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
+									  lps);
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
 
-	/* Take over the shared balance value to heap scan */
-	VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
 
 	/* Disable shared cost balance for vacuum delay */
 	VacuumSharedCostBalance = NULL;
@@ -2115,7 +2119,8 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 
 /*
  * Index vacuuming and index cleanup routine used by parallel vacuum
- * worker processes including the leader process.
+ * worker processes and the leader process to process the indexes in
+ * parallel.
  */
 static void
 vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
@@ -2123,8 +2128,11 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 								 LVShared *lvshared,
 								 LVDeadTuples *dead_tuples)
 {
-	/* Increment the active worker count */
-	pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
 
 	/* Loop until all indexes are vacuumed */
 	for (;;)
@@ -2139,18 +2147,20 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 		if (idx >= nindexes)
 			break;
 
-		/*
-		 * Skip if this index doesn't support parallel execution
-		 * at this time.
-		 */
+		/* Skip processing indexes that don't support parallel operation */
 		if (skip_parallel_index_vacuum(Irel[idx], lvshared))
 			continue;
 
 		/* Increment the processing count */
 		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
 
-		/* Get index statistics struct of this index */
+		/* Get the index statistics of this index from DSM */
 		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * This must exist in DSM as we reach here only for indexes that
+		 * support the parallel operation.
+		 */
 		Assert(shared_indstats);
 
 		/* Do vacuum or cleanup one index */
@@ -2163,7 +2173,8 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 	 * We have completed the index vacuum so decrement the active worker
 	 * count.
 	 */
-	pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
 }
 
 /*
@@ -2176,7 +2187,7 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 static void
 vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 								  int nindexes, IndexBulkDeleteResult **stats,
-								  LVParallelState *lps, bool in_parallel)
+								  LVParallelState *lps)
 {
 	int nindexes_remains;
 	int i;
@@ -2191,8 +2202,10 @@ vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	if (nindexes_remains == 0)
 		return;
 
-	/* Increment the active worker count if some worker might be running */
-	if (in_parallel)
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
 		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
 
 	for (i = 0; i < nindexes; i++)
@@ -2216,7 +2229,7 @@ vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	 * We have completed the index vacuum so decrement the active worker
 	 * count.
 	 */
-	if (in_parallel)
+	if (VacuumActiveNWorkers)
 		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
 
 #ifdef USE_ASSERT_CHECKING
@@ -3095,8 +3108,8 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
 }
 
 /*
- * Initialize variables for shared index statistics, set NULL bitmap and
- * the struct size of each indexes.  Also this function sets the number of
+ * Initialize variables for shared index statistics, set NULL bitmap and the
+ * size of stats for each index.  Also, this function sets the number of
  * indexes that do not support parallel index vacuuming and that use
  * maintenance_work_mem.  Since currently we don't support parallel vacuum
  * for autovacuum we don't need to care about autovacuum_work_mem.
@@ -3105,30 +3118,31 @@ static void
 prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
 	int nworkers)
 {
-	char *p = (char *)GetSharedIndStats(lvshared);
+	char *p = (char *) GetSharedIndStats(lvshared);
 	int nindexes_mwm = 0;
 	int i;
 
 	Assert(!IsAutoVacuumWorkerProcess());
 
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
 	for (i = 0; i < nindexes; i++)
 	{
 		LVSharedIndStats *indstats;
 
 		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
 			VACUUM_OPTION_NO_PARALLEL)
-		{
-			/* Set NULL as this index does not support parallel vacuum */
-			lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);
 			continue;
-		}
 
 		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
 			nindexes_mwm++;
 
-		/* Set the size for index statistics */
-		indstats = (LVSharedIndStats *)p;
+		/* Set NOT NULL as this index does support parallelism */
 		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
 		indstats->size = index_parallelvacuum_estimate(Irel[i]);
 
 		p += SizeOfSharedIndStats(indstats);
@@ -3330,7 +3344,7 @@ get_indstats(LVShared *lvshared, int n)
 }
 
 /*
- * Check if the given index participates parallel index vacuuming
+ * Check if the given index participates in parallel index vacuuming
  * or parallel index cleanup.
  */
 static bool
@@ -3343,14 +3357,16 @@ skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
 
 	if (lvshared->for_cleanup)
 	{
-		/* Skip if the index does not support parallel cleanup */
+		/* Skip, if the index does not support parallel cleanup */
 		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
 			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
 			return true;
 
 		/*
-		 * Skip if the index support to parallel cleanup only first
-		 * time cleanup but it is not the first time.
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
 		 */
 		if (!lvshared->first_time &&
 			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
-- 
2.16.2.windows.1

#237Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#235)

On Wed, 27 Nov 2019 at 08:14, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version. BTW, why haven't you posted
0001 patch (IndexAM API's patch)? I think without that we need to use
the previous version for that. Also, I think we should post Dilip's
patch related to Gist index [1] modifications for parallel vacuum or
at least mention it when posting a new version, as without it even
make check fails.

[1] -
/messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com

I did some testing on top of the v33 patch set. By debugging, I was able to
hit one assert in lazy_parallel_vacuum_or_cleanup_indexes.
TRAP: FailedAssertion("nprocessed == nindexes_remains", File:
"vacuumlazy.c", Line: 2099)

I debugged further and found that this assert is not valid in all
cases. nprocessed can be less than nindexes_remains because a parallel
worker may already have claimed an index (the idx count is incremented
in vacuum_or_cleanup_indexes_worker) while its work is not yet finished
(lvshared->nprocessed is not incremented yet). In that case, nprocessed
will be less than nindexes_remains. I think we should remove this
assert.
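
To make the window concrete, here is a minimal sketch of the
worker-side ordering (the names are taken from the patch, but the code
below is illustrative only, not the exact patch code):

/* Sketch of the claim/count ordering in vacuum_or_cleanup_indexes_worker */
for (;;)
{
	/*
	 * (1) Claim the next index; the claim is visible to the leader as
	 * soon as lvshared->idx has been advanced.
	 */
	int		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);

	if (idx >= nindexes)
		break;

	/*
	 * Window: lvshared->nprocessed has not been incremented yet.  A
	 * leader that computes nindexes_remains in this window still counts
	 * this claimed index as unprocessed, so its local nprocessed can
	 * end up smaller than nindexes_remains and the assert fires.
	 */

	/* (2) Only now is the shared processing count bumped. */
	pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);

	/* ... vacuuming or cleanup of the claimed index happens here ... */
}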

I have one comment about the assert-only variable:

+#ifdef USE_ASSERT_CHECKING
+ int nprocessed = 0;
+#endif

I think we can write the above declaration as "int nprocessed
PG_USED_FOR_ASSERTS_ONLY = 0" so that the code looks cleaner, because
USE_ASSERT_CHECKING is currently used in 3 places within 20-30 lines of
code.
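
To be concrete, the change would look like this (PG_USED_FOR_ASSERTS_ONLY
already exists in c.h for exactly this purpose; the snippet is just a
sketch of the declaration change):

/* Instead of guarding the declaration ... */
#ifdef USE_ASSERT_CHECKING
	int		nprocessed = 0;
#endif

/* ... declare it unconditionally and mark it as assert-only: */
	int		nprocessed PG_USED_FOR_ASSERTS_ONLY = 0;

The #ifdef blocks around the nprocessed++ increments could then be
dropped too; maintaining the counter is harmless in non-assert builds,
and the attribute keeps the compiler from warning about a variable that
is set but never read.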

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#238Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#236)
4 attachment(s)

On Wed, 27 Nov 2019 at 13:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 8:14 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version.

Few comments:
-----------------------
1.
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+ IndexBulkDeleteResult **stats,
+ LVShared *lvshared,
+ LVDeadTuples *dead_tuples)
+{
+ /* Increment the active worker count */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);

The above code is wrong because it is possible that this function is
called even when there are no workers in which case
VacuumActiveNWorkers will be NULL.

2.
+ /* Take over the shared balance value to heap scan */
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);

We can carry over shared balance only if the same is active.

3.
+ if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+ VACUUM_OPTION_NO_PARALLEL)
+ {
+ /* Set NULL as this index does not support parallel vacuum */
+ lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);

Can we avoid setting this for each index by initializing bitmap as all
NULL's as is done in the attached patch?

4.
+ /*
+ * Variables to control parallel index vacuuming.  Index statistics
+ * returned from ambulkdelete and amvacuumcleanup is nullable variable
+ * length.  'offset' is NULL bitmap. Note that a 0 indicates a null,
+ * while 1 indicates non-null.  The index statistics follows at end of
+ * struct.
+ */

This comment is not clear, so I have re-worded it. See if the
changed comment makes sense.

I have fixed all the above issues, made a couple of other cosmetic
changes and modified a few comments. See the changes in
v34-0002-delta-amit. I am attaching just the delta patch on top of
v34-0002-Add-parallel-option-to-VACUUM-command.

Thank you for reviewing this patch. All the changes you made look good to me.

I thought I had already posted all the v34 patches but hadn't, sorry. So
I've attached the v35 patch set, which incorporates your changes and
includes Dilip's patch for the gist index (0001). These patches can be
applied on top of the current HEAD, and make check should pass.
Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v35-0002-Add-index-AM-field-and-callback-for-parallel-ind.patch
From 16c04f8bf2230e3e3a4b7d46e13b2a9f710f994e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v35 2/4] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  5 +++
 doc/src/sgml/indexam.sgml                     | 22 ++++++++++++
 src/backend/access/brin/brin.c                |  5 +++
 src/backend/access/gin/ginutil.c              |  5 +++
 src/backend/access/gist/gist.c                |  5 +++
 src/backend/access/hash/hash.c                |  4 +++
 src/backend/access/index/indexam.c            | 27 +++++++++++++++
 src/backend/access/nbtree/nbtree.c            |  4 +++
 src/backend/access/spgist/spgutils.c          |  5 +++
 src/include/access/amapi.h                    | 13 +++++++
 src/include/access/genam.h                    |  1 +
 src/include/commands/vacuum.h                 | 34 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  4 +++
 13 files changed, 134 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..cde36c5b49 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
@@ -144,6 +148,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..9fed438fc6 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -149,6 +153,9 @@ typedef struct IndexAmRoutine
     amestimateparallelscan_function amestimateparallelscan;    /* can be NULL */
     aminitparallelscan_function aminitparallelscan;    /* can be NULL */
     amparallelrescan_function amparallelrescan;    /* can be NULL */
+
+    /* interface function to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 </programlisting>
   </para>
@@ -731,6 +738,21 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory needed to
+   store statistics returned by the access method.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does
+   not require more than the size of <structname>IndexBulkDeleteResult</structname>
+   to store statistics.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..fbb4af9df1 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
@@ -124,6 +128,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..8c174b28fc 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
@@ -76,6 +80,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8d9c8d025d..bbb630fb88 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -97,6 +101,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..10d6efdd9f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
@@ -95,6 +98,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 4af418287d..d176f0193b 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -711,6 +711,33 @@ index_vacuum_cleanup(IndexVacuumInfo *info,
 	return indexRelation->rd_indam->amvacuumcleanup(info, stats);
 }
 
+/*
+ * index_parallelvacuum_estimate
+ *
+ * Estimates the DSM space needed to store statistics for parallel vacuum.
+ */
+Size
+index_parallelvacuum_estimate(Relation indexRelation)
+{
+	Size		nbytes;
+
+	RELATION_CHECKS;
+
+	/*
+	 * If amestimateparallelvacuum is not provided, assume only
+	 * IndexBulkDeleteResult is needed.
+	 */
+	if (indexRelation->rd_indam->amestimateparallelvacuum != NULL)
+	{
+		nbytes = indexRelation->rd_indam->amestimateparallelvacuum();
+		Assert(nbytes >= MAXALIGN(sizeof(IndexBulkDeleteResult)));
+	}
+	else
+		nbytes = MAXALIGN(sizeof(IndexBulkDeleteResult));
+
+	return nbytes;
+}
+
 /* ----------------
  *		index_can_return
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index c67235ab80..6ff80876e2 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -123,6 +123,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
@@ -146,6 +149,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = btestimateparallelscan;
 	amroutine->aminitparallelscan = btinitparallelscan;
 	amroutine->amparallelrescan = btparallelrescan;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..1216d2702c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
@@ -79,6 +83,7 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..eb23f01ab6 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -156,6 +156,12 @@ typedef void (*aminitparallelscan_function) (void *target);
 /* (re)start parallel index scan */
 typedef void (*amparallelrescan_function) (IndexScanDesc scan);
 
+/*
+ * Callback function signature - for parallel index vacuuming.
+ */
+/* estimate size of statistics needed for parallel index vacuum */
+typedef Size (*amestimateparallelvacuum_function) (void);
+
 /*
  * API struct for an index AM.  Note this must be stored in a single palloc'd
  * chunk of memory.
@@ -197,6 +203,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
@@ -230,6 +240,9 @@ typedef struct IndexAmRoutine
 	amestimateparallelscan_function amestimateparallelscan; /* can be NULL */
 	aminitparallelscan_function aminitparallelscan; /* can be NULL */
 	amparallelrescan_function amparallelrescan; /* can be NULL */
+
+	/* interface function to support parallel vacuum */
+	amestimateparallelvacuum_function amestimateparallelvacuum; /* can be NULL */
 } IndexAmRoutine;
 
 
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a813b004be..48ed5bbac7 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -179,6 +179,7 @@ extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
 												void *callback_state);
 extern IndexBulkDeleteResult *index_vacuum_cleanup(IndexVacuumInfo *info,
 												   IndexBulkDeleteResult *stats);
+extern Size index_parallelvacuum_estimate(Relation indexRelation);
 extern bool index_can_return(Relation indexRelation, int attno);
 extern RegProcedure index_getprocid(Relation irel, AttrNumber attnum,
 									uint16 procnum);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..5f23f1ab1d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,40 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags to control the participation of bulkdelete and vacuumcleanup in
+ * parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by index AMs that don't want to participate in parallel vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * index AMs that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete has not been
+ * performed yet.  This will be used by index AMs that need to scan the
+ * index only if bulkdelete was not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by index AMs that scan the index
+ * during the cleanup phase of index vacuuming, irrespective of whether the
+ * index was already scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..096534a6ee 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
@@ -317,6 +320,7 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->amestimateparallelscan = NULL;
 	amroutine->aminitparallelscan = NULL;
 	amroutine->amparallelrescan = NULL;
+	amroutine->amestimateparallelvacuum = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
-- 
2.23.0

v35-0001-delete-empty-page-in-gistbulkdelete.patch
From 42d77e85a2e7a0a3562bd66e76f911feb07e469e Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 22 Oct 2019 13:54:14 +0530
Subject: [PATCH v35 1/4] delete empty page in gistbulkdelete

---
 src/backend/access/gist/gistvacuum.c | 148 +++++++++++----------------
 1 file changed, 59 insertions(+), 89 deletions(-)

diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e4015b3..6551558a41 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *stats);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,24 +83,12 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
-	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
 	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -161,7 +121,7 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +135,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +146,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+												"GiST VACUUM page set context",
+												16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +219,20 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
 }
 
 /*
@@ -278,7 +249,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +277,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +358,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +375,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +413,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +436,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +445,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +491,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +531,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +543,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +566,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +635,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
2.23.0

v35-0004-Add-paralell-P-option-to-vacuumdb-command.patch
From ebed3539a244a56de590a77c4ced52fc9d168fae Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v35 4/4] Add --parallel, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 10 ++++++-
 src/bin/scripts/vacuumdb.c        | 48 ++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..f6ac0c6e5a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">workers</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">workers</replaceable></option></term>
+      <listitem>
+       <para>
+        Execute parallel vacuum with <productname>PostgreSQL</productname>'s
+        <replaceable class="parameter">workers</replaceable> background workers.
+       </para>
+       <para>
+        This option will require background workers, so make sure your
+        <xref linkend="guc-max-parallel-workers-maintenance"/> setting is more
+        than one.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..8fe80719e8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 48;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P2', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL\).*;/,
+	'vacuumdb -P');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 2c7219239f..63bf66a70b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -34,6 +34,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* -1 disables, 0 for choosing based on the
+									 * number of indexes */
 } vacuumingOptions;
 
 
@@ -86,6 +88,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", optional_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -115,6 +118,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -122,7 +126,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P::U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,6 +186,24 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				{
+					int parallel_workers = 0;
+
+					if (optarg != NULL)
+					{
+						parallel_workers = atoi(optarg);
+						if (parallel_workers <= 0)
+						{
+							pg_log_error("number of parallel workers must be at least 1");
+							exit(1);
+						}
+					}
+
+					/* allow setting 0, meaning PARALLEL without a parallel degree */
+					vacopts.parallel_workers = parallel_workers;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -254,9 +276,22 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	if (vacopts.full && vacopts.parallel_workers >= 0)
+	{
+		pg_log_error("cannot use the \"%s\" option with \"%s\" option",
+					 "full", "parallel");
+		exit(1);
+	}
+
 	setup_cancel_handler();
 
 	/* Avoid opening extra connections. */
@@ -822,6 +857,16 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers > 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep, vacopts->parallel_workers);
+				sep = comma;
+			}
+			if (vacopts->parallel_workers == 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL", sep);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -885,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel[=NUM]            do parallel vacuuming\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.23.0

v35-0003-Add-parallel-option-to-VACUUM-command.patch
From 483cd0ea5f9e3f3bddc18b788cc79f75204a3a74 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v35 3/4] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and a parallel degree larger than the number of
indexes that the table has cannot be specified.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml               |   14 +-
 doc/src/sgml/ref/vacuum.sgml           |   45 +
 src/backend/access/heap/vacuumlazy.c   | 1288 ++++++++++++++++++++++--
 src/backend/access/nbtree/nbtsort.c    |    2 +-
 src/backend/access/transam/parallel.c  |    9 +-
 src/backend/commands/vacuum.c          |  109 +-
 src/backend/executor/nodeGather.c      |    2 +-
 src/backend/executor/nodeGatherMerge.c |    2 +-
 src/backend/postmaster/autovacuum.c    |    2 +
 src/bin/psql/tab-complete.c            |    2 +-
 src/include/access/heapam.h            |    6 +
 src/include/access/parallel.h          |    2 +-
 src/include/commands/vacuum.h          |    5 +
 src/test/regress/expected/vacuum.out   |   26 +
 src/test/regress/sql/vacuum.sql        |   25 +
 15 files changed, 1421 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d4d1fe45cc..7e17d98fd8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2310,13 +2310,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on
+      the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before the start of each phase and exit at the
+      end of the phase.  These behaviors might change in a future release.
+      This option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..36f4ef1772 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,22 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  The leader process then re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Note that parallel workers
+ * exist only during index vacuuming or index cleanup, while the leader
+ * process neither exits from parallel mode nor destroys the parallel
+ * context in between.  Since updates are not allowed while in parallel
+ * mode, we update the index statistics only after exiting from parallel
+ * mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +52,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +74,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +130,160 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in parallel mode and have prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool	for_cleanup;
+	bool	first_time;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old live tuple count in the index vacuuming case or to the new
+	 * live tuple count in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In a single-process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning.
+	 * In parallel index vacuuming, since individual vacuum workers can
+	 * consume memory equal to maintenance_work_mem, the new
+	 * maintenance_work_mem for each worker is set such that the parallel
+	 * operation doesn't consume more memory than a single-process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go
+	 * for the delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated? */
+
+	/* IndexBulkDeleteResult data follows at end of struct */
+} LVSharedIndStats;
+
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion
+	 * and parallel index cleanup respectively.
+	 */
+	int				nindexes_parallel_bulkdel;
+	int				nindexes_parallel_cleanup;
+	int				nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +302,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -148,6 +318,26 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process
+ * to have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep
+ * only if it has performed the I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than VacuumCostLimit set by the
+ * system.  Then we will allow the worker to sleep proportional to the work
+ * done and reduce the VacuumSharedCostBalance by the amount which is
+ * consumed by the current worker (VacuumCostBalanceLocal).  This avoids
+ * putting to sleep workers that have done little or no I/O compared to
+ * other workers, and thereby ensures that workers doing more I/O get
+ * throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
@@ -155,12 +345,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +358,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+											  int nindexes, IndexBulkDeleteResult **stats,
+											  LVParallelState *lps);
+static void vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+											   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+											   LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +709,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes.  At the end of
+ *		this function we exit from parallel mode.  Index bulk-deletion results
+ *		are stored in the DSM segment, and we update the index statistics as a
+ *		whole after exiting from parallel mode, since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +729,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +753,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +789,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to launch if the parallel
+	 * vacuum is requested and we need to vacuum the indexes.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single process vacuum. We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +1001,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +1030,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1050,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1246,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1285,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1431,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1501,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1530,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1645,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1679,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1695,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1721,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1799,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1808,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1856,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1867,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1997,395 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	int	nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/* Cap by the number of workers we computed at the beginning of parallel lazy vacuum */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		/*
+		 * Reset the local value so that we compute cost balance during
+		 * parallel index vacuuming.
+		 */
+		VacuumCostBalance = 0;
+		VacuumCostBalanceLocal = 0;
+
+		LaunchParallelWorkers(lps->pcxt, nworkers);
+
+		/* Enable shared costing iff we process indexes in parallel. */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/*
+	 * Join as a parallel worker.  The leader process alone does the work
+	 * if no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * At this point, only the indexes that were skipped during parallel
+	 * index vacuuming remain.  If there are any such indexes, the leader
+	 * process vacuums or cleans them up one by one.
+	 */
+	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
+									  lps);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+	VacuumActiveNWorkers = NULL;
+
+	/*
+	 * In the cleanup case we don't need to reinitialize the parallel
+	 * context, as no more index vacuuming or index cleanup will be
+	 * performed afterwards.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing counts */
+		pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes and the leader process to process the indexes in
+ * parallel.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Skip processing indexes that don't support parallel operation */
+		if (skip_parallel_index_vacuum(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * This must exist in DSM as we reach here only for indexes that
+		 * support the parallel operation.
+		 */
+		Assert(shared_indstats);
+
+		/* Do vacuum or cleanup one index */
+		vacuum_or_cleanup_one_index_worker(Irel[idx], &(stats[idx]),
+										   lvshared, shared_indstats,
+										   dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up the indexes that were skipped during the parallel
+ * operation because they don't support parallel processing at this phase.
+ * This function must be called only by the leader process.
+ */
+static void
+vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								  int nindexes, IndexBulkDeleteResult **stats,
+								  LVParallelState *lps)
+{
+	int nindexes_remains;
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	nindexes_remains = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	Assert(nindexes_remains >= 0);
+
+	/* Quick exit if all indexes have already been processed */
+	if (nindexes_remains == 0)
+		return;
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool processed = !skip_parallel_index_vacuum(Irel[i], lps->lvshared);
+
+		/* Skip the already processed indexes */
+		if (processed)
+			continue;
+
+		vacuum_or_cleanup_one_index_worker(Irel[i], &(stats[i]),
+										   lps->lvshared, get_indstats(lps->lvshared, i),
+										   vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up one index.  This is used by both parallel vacuum
+ * workers and the leader process.  After processing an index, this
+ * function copies the index statistics returned from ambulkdelete or
+ * amvacuumcleanup to the DSM segment.
+ */
+static void
+vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+								   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+								   LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup one index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete
+	 * or amvacuumcleanup to the DSM segment the first time we get it,
+	 * because the index AM allocates it locally and it's possible that
+	 * the index will be vacuumed by a different process next time.
+	 * Copying the result normally happens only after the first cycle of
+	 * index vacuuming.  From the second time on, we pass the result
+	 * stored in the DSM segment so that the index AM updates it
+	 * directly.
+	 *
+	 * Since all vacuum workers write their bulk-deletion results to
+	 * different slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, shared_indstats->size);
+		shared_indstats->updated = true;
+
+		/*
+		 * We no longer need the locally allocated result; *stats now
+		 * points into the DSM segment.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					 int nindexes, IndexBulkDeleteResult **stats,
+					 LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+						(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+					(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
  *		vacrelstats->dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2395,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2434,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
 
-	pfree(stats);
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2797,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2821,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2877,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +3030,415 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The relation sizes of the table and indexes don't affect the parallel
+ * degree for now.  nrequested is the number of parallel workers that the
+ * user requested.  If nrequested is 0 we compute the parallel degree based
+ * on nindexes, the number of indexes that support parallel index
+ * vacuuming.
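+ *
+ * For example (illustrative numbers, assuming all three indexes of a table
+ * support parallel bulk-deletion): nindexes_parallel is 3, the leader takes
+ * one index, and with nrequested = 0 we would request
+ * Min(2, max_parallel_maintenance_workers) workers.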
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_parallel = 0;
+	int		nindexes_parallel_bulkdel = 0;
+	int		nindexes_parallel_cleanup = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of stats for each index.  This function also computes the
+ * maintenance_work_mem each worker can use, based on the number of indexes
+ * that use maintenance_work_mem.  Since we currently don't support parallel
+ * vacuum for autovacuum, we don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+	int nworkers)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
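+	/*
+	 * For example (illustrative numbers only): with maintenance_work_mem
+	 * set to 1GB, nworkers = 4 and two indexes that use
+	 * maintenance_work_mem, each worker would get 1GB / Min(4, 2) = 512MB.
+	 */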
+	lvshared->maintenance_work_mem_worker =
+					(nindexes_mwm > 0) ?
+					maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+					maintenance_work_mem;
+}
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
+		{
+			est_shared = add_size(est_shared,
+								  add_size(sizeof(LVSharedIndStats),
+										   index_parallelvacuum_estimate(Irel[i])));
+
+			/*
+			 * Remember the number of indexes that support parallel operation
+			 * for each phase.
+			 */
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+				lps->nindexes_parallel_bulkdel++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+				lps->nindexes_parallel_cleanup++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+				lps->nindexes_parallel_condcleanup++;
+		}
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+
+	/*
+	 * The offset must be MAXALIGN'd since we estimated the shared memory
+	 * size that way above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed during parallel mode, and it might not be safe to
+ * exit parallel mode while keeping the parallel context.  So we copy the
+ * updated index statistics into local memory and later use that copy to
+ * update the index statistics in pg_class.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuuming or cleanup.
+ */
+static bool
+skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
+{
+	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  We use the same lock mode as the leader process;
+	 * that's okay because this lock mode does not conflict among the
+	 * parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 1dd39a9535..c9972f5f37 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1428,7 +1428,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_QUERY_TEXT, sharedquery);
 
 	/* Launch workers, saving status for leader/caller */
-	LaunchParallelWorkers(pcxt);
+	LaunchParallelWorkers(pcxt, request);
 	btleader->pcxt = pcxt;
 	btleader->nparticipanttuplesorts = pcxt->nworkers_launched;
 	if (leaderparticipates)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..157c309211 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
@@ -490,10 +494,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
  * Launch parallel workers.
  */
 void
-LaunchParallelWorkers(ParallelContext *pcxt)
+LaunchParallelWorkers(ParallelContext *pcxt, int nworkers)
 {
 	MemoryContext oldcontext;
 	BackgroundWorker worker;
+	int			nworkers_to_launch = Min(nworkers, pcxt->nworkers);;
 	int			i;
 	bool		any_registrations_failed = false;
 
@@ -533,7 +538,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..4b7f480fd6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1768,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  Rather than skipping
+	 * vacuum on the table entirely, just disabling the parallel option is
+	 * the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = 0;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1985,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (!VacuumCostActive || InterruptPending)
+		return;
+
+	/*
+	 * If the vacuum cost balance is shared among parallel workers we
+	 * decide whether to sleep based on that.
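+	 * Each worker adds its local balance to the shared one, and naps only
+	 * when the shared balance exceeds VacuumCostLimit and its own
+	 * contribution exceeds half of its fair share.  For example
+	 * (illustrative numbers only): with VacuumCostLimit = 200 and two
+	 * active workers, a worker sleeps once the shared balance reaches 200
+	 * and its own local balance exceeds 0.5 * (200 / 2) = 50.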
+	 */
+	if (VacuumSharedCostBalance != NULL)
 	{
-		double		msec;
+		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+		/* At least count itself */
+		Assert(nworkers >= 1);
+
+		/* Update the shared cost balance value atomically */
+		while (true)
+		{
+			uint32 shared_balance;
+			uint32 new_balance;
+			uint32 local_balance;
+
+			msec = 0;
+
+			/* compute new balance by adding the local value */
+			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+			new_balance = shared_balance + VacuumCostBalance;
 
+			/* also compute the total local balance */
+			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+			if ((new_balance >= VacuumCostLimit) &&
+				(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+			{
+				/* compute sleep time based on the local cost balance */
+				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+				new_balance = shared_balance - VacuumCostBalanceLocal;
+				VacuumCostBalanceLocal = 0;
+			}
+
+			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+											   &shared_balance,
+											   new_balance))
+			{
+				/* Updated successfully, break */
+				break;
+			}
+		}
+
+		VacuumCostBalanceLocal += VacuumCostBalance;
+
+		/*
+		 * Reset the local balance as we accumulated it into the shared
+		 * value.
+		 */
+		VacuumCostBalance = 0;
+	}
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 69d5a1f239..df28ff2927 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -183,7 +183,7 @@ ExecGather(PlanState *pstate)
 			 * requested, or indeed any at all.
 			 */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gather->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 6ef128e2ab..cb9d5a725a 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -224,7 +224,7 @@ ExecGatherMerge(PlanState *pstate)
 
 			/* Try to launch workers. */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gm->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index df26826993..4df3de429e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3585,7 +3585,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..61725e749f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -190,9 +192,13 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..e5e6ae6c08 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -63,7 +63,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
-extern void LaunchParallelWorkers(ParallelContext *pcxt);
+extern void LaunchParallelWorkers(ParallelContext *pcxt, int nworkers);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
 extern void DestroyParallelContext(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5f23f1ab1d..0a586dca8d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -218,6 +218,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers
+	 * and 0 for choosing based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..cf5e1f0a4e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0aecf17773 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

#239Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh (#237)

On Wed, 27 Nov 2019 at 13:28, Mahendra Singh <mahi6run@gmail.com> wrote:

On Wed, 27 Nov 2019 at 08:14, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version. BTW, why haven't you posted
0001 patch (IndexAM API's patch)? I think without that we need to use
the previous version for that. Also, I think we should post Dilip's
patch related to Gist index [1] modifications for parallel vacuum or
at least have a mention for that while posting a new version as
without that even make check fails.

[1] - /messages/by-id/CAFiTN-uQY+B+CLb8W3YYdb7XmB9hyYFXkAy3C7RY=-YSWRV1DA@mail.gmail.com

I did some testing on top of the v33 patch set. By debugging, I was able to hit one assert in lazy_parallel_vacuum_or_cleanup_indexes:
TRAP: FailedAssertion("nprocessed == nindexes_remains", File: "vacuumlazy.c", Line: 2099)

I debugged further and found that this assert is not valid in all cases: nprocessed can be less than nindexes_remains, because a parallel worker can have been launched for vacuum and the idx count incremented in vacuum_or_cleanup_indexes_worker for a particular index while the work on that index is still not finished (lvshared->nprocessed is not incremented yet). In that case nprocessed will be less than nindexes_remains. I think we should remove this assert.

I have one comment about the assert-only variable:

+#ifdef USE_ASSERT_CHECKING
+ int nprocessed = 0;
+#endif

I think we can write the above declaration as "int nprocessed PG_USED_FOR_ASSERTS_ONLY = 0" so that the code looks cleaner, because this USE_ASSERT_CHECKING block appears in 3 places within 20-30 lines of code.
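
That is, roughly (an untested sketch of the idea):

-#ifdef USE_ASSERT_CHECKING
-	int			nprocessed = 0;
-#endif
+	int			nprocessed PG_USED_FOR_ASSERTS_ONLY = 0;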

Thank you for testing!

Yes, I think your analysis is right. I've removed the assertion in the v35
patch that I've just posted[1].

[1]: /messages/by-id/CA+fd4k5oAuGuwZ9XaOTv+cTU8-dmA3RjpJ+i4x5kt9VbAFse1w@mail.gmail.com

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#240Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#238)

On Wed, 27 Nov 2019 at 23:14, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 27 Nov 2019 at 13:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 8:14 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 27, 2019 at 12:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've incorporated the comments I got so far including the above and
the memory alignment issue.

Thanks, I will look into the new version.

Few comments:
-----------------------
1.
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+ IndexBulkDeleteResult **stats,
+ LVShared *lvshared,
+ LVDeadTuples *dead_tuples)
+{
+ /* Increment the active worker count */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);

The above code is wrong because it is possible that this function is
called even when there are no workers in which case
VacuumActiveNWorkers will be NULL.

2.
+ /* Take over the shared balance value to heap scan */
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);

We can carry over shared balance only if the same is active.

3.
+ if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+     VACUUM_OPTION_NO_PARALLEL)
+ {
+     /* Set NULL as this index does not support parallel vacuum */
+     lvshared->bitmap[i >> 3] |= 0 << (i & 0x07);

Can we avoid setting this for each index by initializing the bitmap as all
NULLs, as is done in the attached patch? (Note that |= 0 << (i & 0x07) is a
no-op anyway, so the quoted line never clears anything.) A sketch is below.
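
A minimal sketch of the initialization, assuming the patch's
amparallelvacuumoptions field and the BITMAPLEN macro from
htup_details.h (an illustration of the idea, not the exact code in the
attached patch):

	int		i;

	/* Start with all entries NULL ... */
	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));

	/* ... then set a bit only for indexes supporting parallel vacuum */
	for (i = 0; i < nindexes; i++)
	{
		if (Irel[i]->rd_indam->amparallelvacuumoptions !=
			VACUUM_OPTION_NO_PARALLEL)
			lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
	}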

4.
+ /*
+  * Variables to control parallel index vacuuming.  Index statistics
+  * returned from ambulkdelete and amvacuumcleanup is nullable variable
+  * length.  'offset' is NULL bitmap. Note that a 0 indicates a null,
+  * while 1 indicates non-null.  The index statistics follows at end of
+  * struct.
+  */

This comment is not clear, so I have re-worded it. See, if the
changed comment makes sense.

I have fixed all the above issues, made a couple of other cosmetic
changes and modified a few comments. See the changes in
v34-0002-delta-amit. I am attaching just the delta patch on top of
v34-0002-Add-parallel-option-to-VACUUM-command.

Thank you for reviewing this patch. All the changes you made look good to me.

I thought I had already posted all the v34 patches but hadn't, sorry. So
I've attached the v35 patch set that incorporates your changes and
includes Dilip's patch for the gist index (0001). These patches can be
applied on top of the current HEAD and make check should pass.

Thanks for the re-based patches.

On top of the v35 patch, I can see one compilation warning.

parallel.c: In function ‘LaunchParallelWorkers’:
parallel.c:502:2: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
int i;
^

The above warning is due to an extra semicolon added at the end of a
declaration line in the v35-0003 patch. Please fix this in the next version.
+ int nworkers_to_launch = Min(nworkers, pcxt->nworkers);;

I will continue my testing on top of the v35 patch set and will post
results.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#241Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh (#240)

On Wed, 27 Nov 2019 at 19:21, Mahendra Singh <mahi6run@gmail.com> wrote:

Thanks for the re-based patches.

On the top of v35 patch, I can see one compilation warning.

parallel.c: In function ‘LaunchParallelWorkers’:
parallel.c:502:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
int i;
^

Above warning is due to one extra semicolon added at the end of declaration line in v35-0003 patch. Please fix this in next version.
+ int nworkers_to_launch = Min(nworkers, pcxt->nworkers);;

Thanks. I will fix it in the next version of the patch.

I will continue my testing on the top of v35 patch set and will post results.

Thank you!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#242Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#241)

On Thu, 28 Nov 2019 at 13:32, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 27 Nov 2019 at 19:21, Mahendra Singh <mahi6run@gmail.com> wrote:

Thanks for the re-based patches.

On the top of v35 patch, I can see one compilation warning.

parallel.c: In function ‘LaunchParallelWorkers’:
parallel.c:502:2: warning: ISO C90 forbids mixed declarations and code

[-Wdeclaration-after-statement]

int i;
^

Above warning is due to one extra semicolon added at the end of

declaration line in v35-0003 patch. Please fix this in next version.

+ int nworkers_to_launch = Min(nworkers, pcxt->nworkers);;

Thanks. I will fix it in the next version patch.

I will continue my testing on the top of v35 patch set and will post

results.

While reviewing the v35 patch set and doing testing, I found that if we
disable leader participation, then we launch 1 fewer parallel worker than
the total number of indexes. (I am using max_parallel_workers = 20,
max_parallel_maintenance_workers = 20.)

For example: if the table has 3 indexes and we give a parallel vacuum
degree of 6 (leader participation disabled), then I think we should launch
3 parallel workers, but we launch only 2 workers due to the below check.
+       nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+   /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+   nworkers = Min(nworkers, lps->pcxt->nworkers);

Please let me know your thoughts on this.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#243Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#242)

On Thu, Nov 28, 2019 at 4:10 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Thu, 28 Nov 2019 at 13:32, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 27 Nov 2019 at 19:21, Mahendra Singh <mahi6run@gmail.com> wrote:

Thanks for the re-based patches.

On the top of v35 patch, I can see one compilation warning.

parallel.c: In function ‘LaunchParallelWorkers’:
parallel.c:502:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
int i;
^

Above warning is due to one extra semicolon added at the end of declaration line in v35-0003 patch. Please fix this in next version.
+ int nworkers_to_launch = Min(nworkers, pcxt->nworkers);;

Thanks. I will fix it in the next version patch.

I will continue my testing on the top of v35 patch set and will post results.

While reviewing v35 patch set and doing testing, I found that if we disable leader participation, then we are launching 1 less parallel worker than total number of indexes. (I am using max_parallel_workers = 20, max_parallel_maintenance_workers = 20)

For example: If table have 3 indexes and we gave 6 parallel vacuum degree(leader participation is disabled), then I think, we should launch 3 parallel workers but we are launching 2 workers due to below check.
+       nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+   /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+   nworkers = Min(nworkers, lps->pcxt->nworkers);

Please let me know your thoughts for this.

I think it is probably because this part of the code doesn't consider
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. I think we can change it
if we want, but I am slightly nervous about the code complexity this
will bring; maybe that is fine. A sketch of the change is below.
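
For what it's worth, the change itself could be as small as the sketch
below, assuming the leaderparticipates flag kept in LVParallelState (an
illustration of the idea, not tested code):

	/*
	 * Request one extra worker when the leader does not participate,
	 * so that all indexes can still be processed at the same time.
	 */
	nworkers = lps->nindexes_parallel_bulkdel;
	if (lps->leaderparticipates)
		nworkers--;

	/* Cap by the workers computed at the beginning of parallel vacuum */
	nworkers = Min(nworkers, lps->pcxt->nworkers);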

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#244Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#243)

On Thu, 28 Nov 2019 at 11:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Nov 28, 2019 at 4:10 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Thu, 28 Nov 2019 at 13:32, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 27 Nov 2019 at 19:21, Mahendra Singh <mahi6run@gmail.com> wrote:

Thanks for the re-based patches.

On the top of v35 patch, I can see one compilation warning.

parallel.c: In function ‘LaunchParallelWorkers’:
parallel.c:502:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
int i;
^

Above warning is due to one extra semicolon added at the end of declaration line in v35-0003 patch. Please fix this in next version.
+ int nworkers_to_launch = Min(nworkers, pcxt->nworkers);;

Thanks. I will fix it in the next version patch.

I will continue my testing on the top of v35 patch set and will post results.

While reviewing v35 patch set and doing testing, I found that if we disable leader participation, then we are launching 1 less parallel worker than total number of indexes. (I am using max_parallel_workers = 20, max_parallel_maintenance_workers = 20)

For example: If table have 3 indexes and we gave 6 parallel vacuum degree(leader participation is disabled), then I think, we should launch 3 parallel workers but we are launching 2 workers due to below check.
+       nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+   /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+   nworkers = Min(nworkers, lps->pcxt->nworkers);

Please let me know your thoughts for this.

Thanks!

I think it is probably because this part of the code doesn't consider
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. I think if we want we
can change it but I am slightly nervous about the code complexity this
will bring but maybe that is fine.

Right. I'll try to change it accordingly.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#245Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#244)

On Fri, Nov 29, 2019 at 7:11 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Thu, 28 Nov 2019 at 11:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it is probably because this part of the code doesn't consider
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. I think if we want we
can change it but I am slightly nervous about the code complexity this
will bring but maybe that is fine.

Right. I'll try to change so that.

I am thinking that as PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is
a debugging/testing facility, we should ideally separate this out from
the main patch. BTW, I am hacking/reviewing the patch further, so I
request you to wait a few days' time before we do anything in this
regard.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#246Sergei Kornilov
sk@zsrv.org
In reply to: Amit Kapila (#245)

Hello

Is it possible to change the order of index processing by the parallel leader? In the v35 patchset I see the following order:
- start parallel processes
- leader and parallel workers process the index list and possibly skip some entries
- after that, the parallel leader rechecks the index list and processes the skipped indexes
- WaitForParallelWorkersToFinish

I think it would be better to:
- start parallel processes
- the parallel leader goes through the index list and processes only indexes with skip_parallel_index_vacuum = true
- parallel workers process indexes with skip_parallel_index_vacuum = false
- the parallel leader then participates in the remaining parallel-safe index processing
- WaitForParallelWorkersToFinish

This would give less running time and a better load balance across leader and workers in the case of a few non-parallel and a few parallel indexes; a sketch in terms of the patch's functions follows below.
(if the current order is expected and required for some reason, we need a comment in the code)
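
A rough sketch of the proposed leader flow, using the function names
from the v35 patch (only the ordering is new here; an illustration, not
tested code):

	LaunchParallelWorkers(lps->pcxt, nworkers);

	/* First process the indexes that the workers cannot handle ... */
	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes,
									  stats, lps);

	/* ... then join the workers on the parallel-safe indexes */
	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats,
										 lps->lvshared,
										 vacrelstats->dead_tuples);

	WaitForParallelWorkersToFinish(lps->pcxt);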

Also a few notes on vacuumdb:
It seems we need a version check at least in vacuum_one_database and prepare_vacuum_command, similar to the SKIP_LOCKED or DISABLE_PAGE_SKIPPING features; a sketch is below.
Discussion question: will the difference between the --parallel and --jobs parameters be confusing? Do we need more description for these options?
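
A minimal sketch of such a check in prepare_vacuum_command(), modeled on
the existing SKIP_LOCKED handling; the parallel_workers field and the
exact message are assumptions, since the vacuumdb side is not written
yet:

	if (vacopts->parallel_workers >= 0)
	{
		/* PARALLEL is supported since v13 */
		if (serverVersion < 130000)
		{
			pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
						 "parallel", "13");
			exit(1);
		}
		appendPQExpBuffer(sql, "%sPARALLEL %d", sep,
						  vacopts->parallel_workers);
		sep = comma;
	}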

regards, Sergei

#247Mahendra Singh
mahi6run@gmail.com
In reply to: Sergei Kornilov (#246)
1 attachment(s)

On Sat, 30 Nov 2019 at 19:18, Sergei Kornilov <sk@zsrv.org> wrote:

Hello

Its possible to change order of index processing by parallel leader? In
v35 patchset I see following order:
- start parallel processes
- leader and parallel workers processed index lixt and possible skip some
entries
- after that parallel leader recheck index list and process the skipped
indexes
- WaitForParallelWorkersToFinish

I think it would be better to:
- start parallel processes
- parallel leader goes through index list and process only indexes which
are skip_parallel_index_vacuum = true
- parallel workers processes indexes with skip_parallel_index_vacuum =
false
- parallel leader start participate with remainings parallel-safe index
processing
- WaitForParallelWorkersToFinish

This would be less running time and better load balance across leader and
workers in case of few non-parallel and few parallel indexes.
(if this is expected and required by some reason, we need a comment in
code)

Also few notes to vacuumdb:
Seems we need version check at least in vacuum_one_database and
prepare_vacuum_command. Similar to SKIP_LOCKED or DISABLE_PAGE_SKIPPING
features.
discussion question: difference between --parallel and --jobs parameters
will be confusing? We need more description for this options

While doing testing with different server configuration settings, I am
getting an error (ERROR: no unpinned buffers available) in parallel
vacuum, but normal vacuum is working fine.

Test setup:
max_worker_processes = 40
autovacuum = off
shared_buffers = 128kB
max_parallel_workers = 40
max_parallel_maintenance_workers = 40
vacuum_cost_limit = 2000
vacuum_cost_delay = 10

Table description: the table has 16 indexes (14 btree, 1 hash, 1 BRIN) and
1,000,000 tuples in total. I delete all the tuples, then fire the vacuum
command.
Run the attached .sql file (test_16_indexes.sql):
$ ./psql postgres
postgres=# \i test_16_indexes.sql

Re-start the server and do vacuum.
Case 1) normal vacuum:
postgres=# vacuum test ;
VACUUM
Time: 115174.470 ms (01:55.174)

Case 2) parallel vacuum using 10 parallel workers:
postgres=# vacuum (parallel 10)test ;
ERROR: no unpinned buffers available
CONTEXT: parallel worker
postgres=#

This error comes from the 128kB shared_buffers setting. I launched 10
parallel workers and all are working in parallel, so with such a small
shared buffer I am getting this error.

Is this expected behavior with a small shared buffer size, or should we
try to come up with a solution for this? Please let me know your thoughts.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachments:

test_16_indexes.sql (application/octet-stream)
#248Amit Kapila
amit.kapila16@gmail.com
In reply to: Sergei Kornilov (#246)

On Sat, Nov 30, 2019 at 7:18 PM Sergei Kornilov <sk@zsrv.org> wrote:

Hello

Its possible to change order of index processing by parallel leader? In
v35 patchset I see following order:
- start parallel processes
- leader and parallel workers processed index lixt and possible skip some
entries
- after that parallel leader recheck index list and process the skipped
indexes
- WaitForParallelWorkersToFinish

I think it would be better to:
- start parallel processes
- parallel leader goes through index list and process only indexes which
are skip_parallel_index_vacuum = true
- parallel workers processes indexes with skip_parallel_index_vacuum =
false
- parallel leader start participate with remainings parallel-safe index
processing
- WaitForParallelWorkersToFinish

This would be less running time and better load balance across leader and
workers in case of few non-parallel and few parallel indexes.

Why do you think so? I think the advantage of the current approach is that
once the parallel workers are launched, the leader can process indexes that
don't support parallelism. So, both types of indexes can be processed at
the same time.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#249Sergei Kornilov
sk@zsrv.org
In reply to: Amit Kapila (#248)

Hi

I think the advantage of the current approach is that once the parallel workers are launched, the leader can process indexes that don't support parallelism.  So, both type of indexes can be processed at the same time.

In lazy_parallel_vacuum_or_cleanup_indexes I see:

/*
* Join as a parallel worker. The leader process alone does that in
* case where no workers launched.
*/
if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
vacrelstats->dead_tuples);

/*
* Here, the indexes that had been skipped during parallel index vacuuming
* are remaining. If there are such indexes the leader process does vacuum
* or cleanup them one by one.
*/
vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
lps);

So the parallel leader will process parallel indexes first, along with the parallel workers, and skip the non-parallel ones. Only after reaching the end of the index list will the parallel leader process the non-parallel indexes one by one. In the case of equal index processing times, the parallel leader will process (count of parallel indexes)/(nworkers+1) indexes plus all the non-parallel ones, while each parallel worker will process (count of parallel indexes)/(nworkers+1). Am I wrong here?

regards, Sergei

#250Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Sergei Kornilov (#249)

On Sun, 1 Dec 2019 at 11:06, Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think the advantage of the current approach is that once the parallel workers are launched, the leader can process indexes that don't support parallelism. So, both type of indexes can be processed at the same time.

In lazy_parallel_vacuum_or_cleanup_indexes I see:

/*
* Join as a parallel worker. The leader process alone does that in
* case where no workers launched.
*/
if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
vacrelstats->dead_tuples);

/*
* Here, the indexes that had been skipped during parallel index vacuuming
* are remaining. If there are such indexes the leader process does vacuum
* or cleanup them one by one.
*/
vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
lps);

So parallel leader will process parallel indexes first along with parallel workers and skip non-parallel ones. Only after end of the index list parallel leader will process non-parallel indexes one by one. In case of equal index processing time parallel leader will process (count of parallel indexes)/(nworkers+1) + all non-parallel, while parallel workers will process (count of parallel indexes)/(nworkers+1). I am wrong here?

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the indexes that can be processed only
by the leader process (i.e. indexes not supporting parallel index vacuum)
while the workers are processing indexes that support parallel index
vacuum, right? That way, we can process indexes in parallel as much as
possible. So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there will be parallel-safe indexes remaining once the leader finishes
vacuum_or_cleanup_indexes_worker, as described in your proposal.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#251Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#245)

On Sat, 30 Nov 2019 at 04:06, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 29, 2019 at 7:11 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Thu, 28 Nov 2019 at 11:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it is probably because this part of the code doesn't consider
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. I think if we want we
can change it but I am slightly nervous about the code complexity this
will bring but maybe that is fine.

Right. I'll try to change so that.

I am thinking that as PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is
a debugging/testing facility, we should ideally separate this out from
the main patch. BTW, I am hacking/reviewing the patch further, so
request you to wait for a few day's time before we do anything in this
regard.

Sure, thank you so much. I'll wait for your comments and reviewing.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#252Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh (#247)

On Sat, 30 Nov 2019 at 22:11, Mahendra Singh <mahi6run@gmail.com> wrote:

On Sat, 30 Nov 2019 at 19:18, Sergei Kornilov <sk@zsrv.org> wrote:

Hello

Its possible to change order of index processing by parallel leader? In v35 patchset I see following order:
- start parallel processes
- leader and parallel workers processed index lixt and possible skip some entries
- after that parallel leader recheck index list and process the skipped indexes
- WaitForParallelWorkersToFinish

I think it would be better to:
- start parallel processes
- parallel leader goes through index list and process only indexes which are skip_parallel_index_vacuum = true
- parallel workers processes indexes with skip_parallel_index_vacuum = false
- parallel leader start participate with remainings parallel-safe index processing
- WaitForParallelWorkersToFinish

This would be less running time and better load balance across leader and workers in case of few non-parallel and few parallel indexes.
(if this is expected and required by some reason, we need a comment in code)

Also few notes to vacuumdb:
Seems we need version check at least in vacuum_one_database and prepare_vacuum_command. Similar to SKIP_LOCKED or DISABLE_PAGE_SKIPPING features.
discussion question: difference between --parallel and --jobs parameters will be confusing? We need more description for this options

While doing testing with different server configuration settings, I am getting error (ERROR: no unpinned buffers available) in parallel vacuum but normal vacuum is working fine.

Test Setup:
max_worker_processes = 40
autovacuum = off
shared_buffers = 128kB
max_parallel_workers = 40
max_parallel_maintenance_workers = 40
vacuum_cost_limit = 2000
vacuum_cost_delay = 10

Table description: table have 16 indexes(14 btree, 1 hash, 1 BRIN ) and total 10,00,000 tuples and I am deleting all the tuples, then firing vacuum command.
Run attached .sql file (test_16_indexes.sql)
$ ./psql postgres
postgres=# \i test_16_indexes.sql

Re-start the server and do vacuum.
Case 1) normal vacuum:
postgres=# vacuum test ;
VACUUM
Time: 115174.470 ms (01:55.174)

Case 2) parallel vacuum using 10 parallel workers:
postgres=# vacuum (parallel 10)test ;
ERROR: no unpinned buffers available
CONTEXT: parallel worker
postgres=#

This error is coming due to 128kB shared buffer. I think, I launched 10 parallel workers and all are working paralleling so due to less shared buffer, I am getting this error.

Thank you for testing!

Is this expected behavior with small shared buffer size or we should try to come with a solution for this. Please let me know your thoughts.

I think it's normal behavior when the shared buffer is not big enough.
Since a total of 10 processes were processing different pages at the
same time and you set a small value for shared_buffers, the shared
buffer gets full easily, and you got the proper error. So in this case I
think we should consider either increasing the shared buffer size or
decreasing the parallel degree. I guess you could get this error even
when vacuuming 10 different tables concurrently instead; some rough
numbers are below.
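
Rough arithmetic, assuming the default 8kB block size:

	shared_buffers = 128kB / 8kB per page  = 16 buffers
	processes      = 10 workers + 1 leader = 11
	pinned pages   = at least 1 per process, often more during
	                 an index scan

With only 16 buffers and 11 processes each holding pins, running out of
unpinned buffers is unsurprising.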

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#253Sergei Kornilov
sk@zsrv.org
In reply to: Masahiko Sawada (#250)

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the index that can be processed only
the leader process (i.e. indexes not supporting parallel index vacuum)
while workers are processing indexes supporting parallel index vacuum,
right? That way, we can process indexes in parallel as much as
possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there are parallel-safe remaining indexes after the leader finished
vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing the missed indexes (those not supporting parallel index vacuum), the leader can start processing indexes that support parallel index vacuum, along with the parallel workers.
Exactly: call vacuum_or_cleanup_skipped_indexes after starting the parallel workers but before vacuum_or_cleanup_indexes_worker, or something with a similar effect.
If we have 0 missed indexes, parallel vacuum will run as in the current implementation, with leader participation.

Sorry for my unclear English...

regards, Sergei

#254Dilip Kumar
dilipbalaut@gmail.com
In reply to: Sergei Kornilov (#253)

On Sun, Dec 1, 2019 at 11:01 PM Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the index that can be processed only
the leader process (i.e. indexes not supporting parallel index vacuum)
while workers are processing indexes supporting parallel index vacuum,
right? That way, we can process indexes in parallel as much as
possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there are parallel-safe remaining indexes after the leader finished
vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing missing indexes (not supporting parallel index vacuum), the leader can start processing indexes that support the parallel index vacuum, along with parallel workers.
Exactly call vacuum_or_cleanup_skipped_indexes after start parallel workers but before vacuum_or_cleanup_indexes_worker or something with similar effect.
If we have 0 missed indexes - parallel vacuum will run as in current implementation, with leader participation.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#255Amit Kapila
amit.kapila16@gmail.com
In reply to: Sergei Kornilov (#253)

On Sun, Dec 1, 2019 at 11:01 PM Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the index that can be processed only
the leader process (i.e. indexes not supporting parallel index vacuum)
while workers are processing indexes supporting parallel index vacuum,
right? That way, we can process indexes in parallel as much as
possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there are parallel-safe remaining indexes after the leader finished
vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing missing indexes (not supporting parallel
index vacuum), the leader can start processing indexes that support the
parallel index vacuum, along with parallel workers.

Your idea is good, but remember we have always considered the leader as one
worker if the leader can participate. If we do what you are suggesting,
that won't be completely true, as the leader will not fully participate
in the parallel vacuum. It might be that we don't consider the leader
equivalent to one worker in the presence of indexes that don't support
parallel vacuum, but I am not sure that really matters much. I think
overall it should not matter much because we won't have that many indexes
that don't support parallel vacuum.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#256Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Sergei Kornilov (#253)

On Sun, 1 Dec 2019 at 18:31, Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the index that can be processed only
the leader process (i.e. indexes not supporting parallel index vacuum)
while workers are processing indexes supporting parallel index vacuum,
right? That way, we can process indexes in parallel as much as
possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there are parallel-safe remaining indexes after the leader finished
vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing missing indexes (not supporting parallel index vacuum), the leader can start processing indexes that support the parallel index vacuum, along with parallel workers.
Exactly call vacuum_or_cleanup_skipped_indexes after start parallel workers but before vacuum_or_cleanup_indexes_worker or something with similar effect.
If we have 0 missed indexes - parallel vacuum will run as in current implementation, with leader participation.

I think your idea might not work well in some cases. That is, I think
there are some cases where it's better if the leader participates in
parallel vacuum as a worker as soon as possible, especially if a table
has many indexes that by design don't support parallel vacuum (e.g.
bulkdelete of brin, or indexes using VACUUM_OPTION_PARALLEL_COND_CLEANUP).
Suppose the table has 3 indexes that support parallel vacuum and take 5
sec, 10 sec and 10 sec to vacuum respectively, and 3 indexes that don't
support it and take 2 sec each. In the current patch we launch 2 workers.
With your proposal, the two workers take two of the parallel indexes and
will take 5 sec and 10 sec, while the leader processes the 3 indexes that
don't support parallel vacuum and takes 6 sec. After the worker finishes
its 5 sec index it takes the remaining index and needs 10 sec more, so
the total execution time will be 15 sec. On the other hand, if the leader
participated in parallel vacuum first, the total execution time could be
11 sec (taking 5 sec and then 2 sec * 3); the timeline below spells this
out.

It's just an example and I'm not saying your idea is bad. ISTM the idea
is good on the assumption that all indexes take the same time, or take a
long time, so I'd also like to consider whether that holds in production
and which approach is better when we don't have such an assumption.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#257tushar
tushar.ahuja@enterprisedb.com
In reply to: Masahiko Sawada (#238)

On 11/27/19 11:13 PM, Masahiko Sawada wrote:

Thank you for reviewing this patch. All the changes you made look good to me.

I thought I had already posted all the v34 patches but hadn't, sorry. So
I've attached the v35 patch set that incorporates your changes and
includes Dilip's patch for the gist index (0001). These patches can be
applied on top of the current HEAD and make check should pass.
Regards,

While testing this feature against the v35 patches (minus 0004) on
master, I am getting a crash when a user connects to the server in
single-user mode and tries to perform VACUUM (PARALLEL 1). Output:

[tushar@localhost bin]$ ./postgres --single -D data/  postgres
2019-12-03 12:49:26.967 +0530 [70300] LOG:  database system was
interrupted; last known up at 2019-12-03 12:48:51 +0530
2019-12-03 12:49:26.987 +0530 [70300] LOG:  database system was not
properly shut down; automatic recovery in progress
2019-12-03 12:49:26.990 +0530 [70300] LOG:  invalid record length at
0/29F1638: wanted 24, got 0
2019-12-03 12:49:26.990 +0530 [70300] LOG:  redo is not required

PostgreSQL stand-alone backend 13devel
backend>
backend> vacuum full;
backend> vacuum (parallel 1);
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 444)
./postgres(ExceptionalCondition+0x53)[0x8c6fa3]
./postgres[0x785ced]
./postgres(GetSessionDsmHandle+0xca)[0x49304a]
./postgres(InitializeParallelDSM+0x74)[0x519d64]
./postgres(heap_vacuum_rel+0x18d3)[0x4e47e3]
./postgres[0x631d9a]
./postgres(vacuum+0x444)[0x632f14]
./postgres(ExecVacuum+0x2bb)[0x63369b]
./postgres(standard_ProcessUtility+0x4cf)[0x7b312f]
./postgres[0x7b02c6]
./postgres[0x7b0dd3]
./postgres(PortalRun+0x162)[0x7b1b02]
./postgres[0x7ad874]
./postgres(PostgresMain+0x1002)[0x7aebf2]
./postgres(main+0x1ce)[0x48188e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f4fe6908505]
./postgres[0x481b6a]
Aborted (core dumped)

--
regards,tushar
EnterpriseDB https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

#258Amit Kapila
amit.kapila16@gmail.com
In reply to: tushar (#257)

On Tue, Dec 3, 2019 at 12:55 PM tushar <tushar.ahuja@enterprisedb.com> wrote:

On 11/27/19 11:13 PM, Masahiko Sawada wrote:

Thank you for reviewing this patch. All changes you made looks good to

me.

I thought I already have posted all v34 patches but didn't, sorry. So
I've attached v35 patch set that incorporated your changes and it
includes Dilip's patch for gist index (0001). These patches can be
applied on top of the current HEAD and make check should pass.
Regards,

While testing this feature against the v35 patches (minus 0004) on
master,

Thanks for doing the testing of these patches.

getting crash when user connect to server using single mode and try to
perform vacuum (parallel 1 ) o/p

[tushar@localhost bin]$ ./postgres --single -D data/ postgres
2019-12-03 12:49:26.967 +0530 [70300] LOG: database system was
interrupted; last known up at 2019-12-03 12:48:51 +0530
2019-12-03 12:49:26.987 +0530 [70300] LOG: database system was not
properly shut down; automatic recovery in progress
2019-12-03 12:49:26.990 +0530 [70300] LOG: invalid record length at
0/29F1638: wanted 24, got 0
2019-12-03 12:49:26.990 +0530 [70300] LOG: redo is not required

PostgreSQL stand-alone backend 13devel
backend>
backend> vacuum full;
backend> vacuum (parallel 1);

Parallel vacuum shouldn't be allowed in standalone backends, as we can't
create DSM segments in that mode; the same is true for parallel queries. It
should internally proceed with a serial vacuum (a sketch of the guard is
below). I'll fix it in the next version I am planning to post. BTW, it
seems that the same problem exists for parallel CREATE INDEX.
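
A minimal sketch of the guard, assuming it goes where the parallel
degree is computed (the exact placement in the next version may differ):

	/*
	 * A standalone backend cannot create DSM segments, so fall back
	 * to a serial index vacuum there.
	 */
	if (!IsUnderPostmaster)
		nworkers = 0;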

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#259Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#256)
2 attachment(s)

On Tue, Dec 3, 2019 at 12:56 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Sun, 1 Dec 2019 at 18:31, Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the index that can be processed only
the leader process (i.e. indexes not supporting parallel index vacuum)
while workers are processing indexes supporting parallel index vacuum,
right? That way, we can process indexes in parallel as much as
possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
there are parallel-safe remaining indexes after the leader finished
vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing missing indexes (not supporting parallel

index vacuum), the leader can start processing indexes that support the
parallel index vacuum, along with parallel workers.

Exactly call vacuum_or_cleanup_skipped_indexes after start parallel

workers but before vacuum_or_cleanup_indexes_worker or something with
similar effect.

If we have 0 missed indexes - parallel vacuum will run as in current

implementation, with leader participation.

I think your idea might not work well in some cases.

Good point. I am also not sure whether it is a good idea to make the
suggested change, but I think adding a comment along those lines is not a
bad idea, which I have done in the attached patch.

I have made some other changes as well.
1.
+ if (VacuumSharedCostBalance != NULL)
  {
- double msec;
+ int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+ /* At least count itself */
+ Assert(nworkers >= 1);
+
+ /* Update the shared cost balance value atomically */
+ while (true)
+ {
+ uint32 shared_balance;
+ uint32 new_balance;
+ uint32 local_balance;
+
+ msec = 0;
+
+ /* compute new balance by adding the local value */
+ shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ new_balance = shared_balance + VacuumCostBalance;
+ /* also compute the total local balance */
+ local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ if ((new_balance >= VacuumCostLimit) &&
+     (local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ {
+ /* compute sleep time based on the local cost balance */
+ msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ new_balance = shared_balance - VacuumCostBalanceLocal;
+ VacuumCostBalanceLocal = 0;
+ }
+
+ if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+                                    &shared_balance,
+                                    new_balance))
+ {
+ /* Updated successfully, break */
+ break;
+ }
+ }
+
+ VacuumCostBalanceLocal += VacuumCostBalance;

I see multiple problems with this code. (a) If VacuumSharedCostBalance
is changed by the time of the compare-and-exchange, then the next iteration
might not compute the correct values, as you might have reset
VacuumCostBalanceLocal by that time. (b) In the line new_balance =
shared_balance - VacuumCostBalanceLocal, you need to use new_balance
instead of shared_balance; otherwise it won't account for the balance of
the latest cycle. (c) In the line msec = VacuumCostDelay *
VacuumCostBalanceLocal / VacuumCostLimit;, I think you need to use
local_balance, for reasons similar to (b). (d) I think we can write this
code with fewer variables.

I have fixed all these problems and used a slightly different way to
compute the parallel delay. See compute_parallel_delay() in the attached
delta patch; a sketch of the idea is below.
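
A minimal sketch of the fixed computation, based on the description
above (the actual compute_parallel_delay() in the delta patch may differ
in detail):

	static double
	compute_parallel_delay(void)
	{
		double		msec = 0;
		uint32		shared_balance;
		int			nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);

		/* At least count itself */
		Assert(nworkers >= 1);

		/* Update the shared cost balance value atomically */
		shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
												 VacuumCostBalance);

		/* Compute the total local balance for the current worker */
		VacuumCostBalanceLocal += VacuumCostBalance;

		if ((shared_balance >= VacuumCostLimit) &&
			(VacuumCostBalanceLocal >
			 0.5 * ((double) VacuumCostLimit / nworkers)))
		{
			/* Compute sleep time based on the local cost balance */
			msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
			pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
									VacuumCostBalanceLocal);
			VacuumCostBalanceLocal = 0;
		}

		return msec;
	}

Using atomic add-fetch/sub-fetch instead of a compare-and-exchange loop
sidesteps problem (a) entirely, and accumulating into
VacuumCostBalanceLocal before the check addresses (b) and (c).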

2.
+ /* Setup the shared cost-based vacuum delay and launch workers*/
+ if (nworkers > 0)
+ {
+ /*
+ * Reset the local value so that we compute cost balance during
+ * parallel index vacuuming.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+
+ LaunchParallelWorkers(lps->pcxt, nworkers);
+
+ /* Enable shared costing iff we process indexes in parallel. */
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /* Enable shared cost balance */
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);

This code has issues. We can't initialize
VacuumSharedCostBalance/VacuumActiveNWorkers after launching the workers,
as by that time some other worker could already have changed their values.
This was reported offlist by Mahendra and I have fixed it; the corrected
ordering is sketched below.
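
A sketch of the corrected ordering, using the same names as the quoted
code (the point is simply that the shared values are written before
LaunchParallelWorkers, so no worker can observe uninitialized state):

	/* Enable shared cost balance before any worker can run */
	VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
	VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);

	pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
	pg_atomic_write_u32(VacuumActiveNWorkers, 0);

	/* Reset the local balances for parallel index vacuuming */
	VacuumCostBalance = 0;
	VacuumCostBalanceLocal = 0;

	LaunchParallelWorkers(lps->pcxt, nworkers);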

3. Changed the names of functions that were too long; I think the new
names are more meaningful. If you don't agree with these changes, we can
discuss it.

4. Changed the order of parameters in many functions to match the
existing code.

5. Refactored the code in a few places so that it is easier to follow.

6. Added/edited many comments and made other cosmetic changes.

You can find all these changes in v35-0003-Code-review-amit.patch.

A few other things I would like you to consider:
1. I think the disable_parallel_leader_participation related code can be
extracted into a separate patch, as it is mainly a debug/test aid. You can
also fix the problem reported by Mahendra in that context.

2. I think it would be better if we can somehow disallow very small indexes
from using parallel workers. Can we use min_parallel_index_scan_size to
decide whether a particular index can participate in a parallel vacuum? A
sketch is below.
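
A minimal sketch of such a check inside compute_parallel_workers(),
assuming min_parallel_index_scan_size (which is measured in blocks) is
reused as suggested:

	int		nindexes_parallel = 0;
	int		i;

	for (i = 0; i < nindexes; i++)
	{
		/* Skip indexes too small to be worth a parallel worker */
		if (RelationGetNumberOfBlocks(Irel[i]) <
			min_parallel_index_scan_size)
			continue;
		nindexes_parallel++;
	}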

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v35-0003-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 483cd0ea5f9e3f3bddc18b788cc79f75204a3a74 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 25 Oct 2019 22:47:41 +0900
Subject: [PATCH v35 3/4] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables us
to perform index vacuuming and index cleanup with background workers.
Each index is processed by one vacuum process. Therefore parallel vacuum
can be used when the table has at least two indexes, and a parallel
degree larger than the number of indexes on the table cannot be
specified.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml               |   14 +-
 doc/src/sgml/ref/vacuum.sgml           |   45 +
 src/backend/access/heap/vacuumlazy.c   | 1288 ++++++++++++++++++++++--
 src/backend/access/nbtree/nbtsort.c    |    2 +-
 src/backend/access/transam/parallel.c  |    9 +-
 src/backend/commands/vacuum.c          |  109 +-
 src/backend/executor/nodeGather.c      |    2 +-
 src/backend/executor/nodeGatherMerge.c |    2 +-
 src/backend/postmaster/autovacuum.c    |    2 +
 src/bin/psql/tab-complete.c            |    2 +-
 src/include/access/heapam.h            |    6 +
 src/include/access/parallel.h          |    2 +-
 src/include/commands/vacuum.h          |    5 +
 src/test/regress/expected/vacuum.out   |   26 +
 src/test/regress/sql/vacuum.sql        |   25 +
 15 files changed, 1421 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d4d1fe45cc..7e17d98fd8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2310,13 +2310,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operations on
+      the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1df3b..36f4ef1772 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,22 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  The leader process then re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Note that all parallel workers
+ * live only during index vacuuming or index cleanup, but the leader process
+ * neither exits parallel mode nor destroys the parallel context.
+ * Since updates are not allowed during parallel mode, we update the
+ * index statistics after exiting from parallel mode instead of doing
+ * so within it.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,13 +52,16 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
@@ -55,6 +74,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +130,160 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;	/* # slots allocated in array */
+	int			num_tuples;	/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level. These fields are not modified
+	 * during the lazy vacuum.
+	 */
+	Oid		relid;
+	int		elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool	for_cleanup;
+	bool	first_time;
+
+	/*
+	 * Fields for both index vacuuming and index cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to
+	 * the old live tuples in the index vacuuming case or the new live
+	 * tuples in the index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is estimated value.
+	 */
+	double	reltuples;
+	bool	estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning.
+	 * In parallel index vacuuming, since individual vacuum workers can
+	 * consume memory equal to maintenance_work_mem, the new
+	 * maintenance_work_mem for each worker is set such that the parallel
+	 * operation doesn't consume more memory than single process lazy vacuum.
+	 */
+	int		maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go
+	 * for the delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32	idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32	nprocessed;	/* # of indexes done during parallel execution */
+	uint32				offset;		/* sizeof header incl. bitmap */
+	bits8				bitmap[FLEXIBLE_ARRAY_MEMBER];	 /* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	Size	size;
+	bool	updated;	/* are the stats updated? */
+
+	/* IndexBulkDeleteResult data follows at end of struct */
+} LVSharedIndStats;
+
+#define SizeOfSharedIndStats(s) \
+	(sizeof(LVSharedIndStats) + ((LVSharedIndStats *)(s))->size)
+#define GetIndexBulkDeleteResult(s) \
+	((IndexBulkDeleteResult *)((char *)(s) + sizeof(LVSharedIndStats)))
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext	*pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared		*lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion
+	 * and parallel index cleanup respectively.
+	 */
+	int				nindexes_parallel_bulkdel;
+	int				nindexes_parallel_cleanup;
+	int				nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool			leaderparticipates;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +302,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -148,6 +318,26 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
+/*
+ * Variables for cost-based vacuum delay for parallel index vacuuming.
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process
+ * to have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep
+ * only if it has performed the I/O above a certain threshold, which is
+ * calculated based on the number of active workers (VacuumActiveNWorkers),
+ * and the overall cost balance is more than VacuumCostLimit set by the
+ * system.  Then we will allow the worker to sleep proportional to the work
+ * done and reduce the VacuumSharedCostBalance by the amount which is
+ * consumed by the current worker (VacuumCostBalanceLocal).  This can
+ * avoid letting the workers sleep which has done less or no I/O as compared
+ * to other workers, and therefore can ensure that workers who are doing
+ * more I/O got throttled more.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
@@ -155,12 +345,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +358,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+													int nindexes, IndexBulkDeleteResult **stats,
+													LVParallelState *lps);
+static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+											 IndexBulkDeleteResult **stats,
+											 LVShared *lvshared,
+											 LVDeadTuples *dead_tuples);
+static void vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+											  int nindexes, IndexBulkDeleteResult **stats,
+											  LVParallelState *lps);
+static void vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+											   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+											   LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								int nindexes, IndexBulkDeleteResult **stats,
+								LVParallelState *lps);
+static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								 int nindexes, IndexBulkDeleteResult **stats,
+								 LVParallelState *lps);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
+static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+									 int nworkers);
+static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
+											  BlockNumber nblocks, Relation *Irel,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+								IndexBulkDeleteResult **stats);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +709,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode
+ *		and then create both the parallel context and the DSM segment before
+ *		starting the heap scan, so that we can record dead tuples in the DSM
+ *		segment.  All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all indexes.
+ *		At the end of this function we exit from parallel mode.  Index
+ *		bulk-deletion results are stored in the DSM segment, and we update
+ *		index statistics as a whole after exiting parallel mode, since no
+ *		writes are allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +729,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -518,6 +753,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -553,13 +789,41 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Compute the number of parallel vacuum workers to launch if parallel
+	 * vacuum is requested and we need to vacuum the indexes.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		parallel_workers = compute_parallel_workers(Irel, nindexes,
+													params->nworkers);
+
+	if (parallel_workers > 0)
+	{
+		/*
+		 * Enter parallel mode, create the parallel context and allocate the
+		 * DSM segment.
+		 */
+		lps = begin_parallel_vacuum(vacrelstats,
+									RelationGetRelid(onerel),
+									nblocks, Irel, nindexes,
+									parallel_workers);
+	}
+	else
+	{
+		/*
+		 * Use single-process vacuum.  We allocate the memory space for dead
+		 * tuples locally.
+		 */
+		lazy_space_alloc(vacrelstats, nblocks);
+	}
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +1001,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +1030,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +1050,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1246,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1285,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1431,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1501,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1530,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1645,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1679,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1695,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1463,12 +1721,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1534,7 +1799,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1543,7 +1808,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1591,6 +1856,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples	*dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1601,16 +1867,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1731,19 +1997,395 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be called only by the parallel vacuum leader process.  The
+ * caller must set lps->lvshared->for_cleanup to indicate whether to perform
+ * index vacuuming or index cleanup.
+ */
+static void
+lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+										int nindexes, IndexBulkDeleteResult **stats,
+										LVParallelState *lps)
+{
+	int	nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/* Cap by the number of workers we computed at the beginning of parallel lazy vacuum */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		/*
+		 * Reset the local value so that we compute cost balance during
+		 * parallel index vacuuming.
+		 */
+		VacuumCostBalance = 0;
+		VacuumCostBalanceLocal = 0;
+
+		LaunchParallelWorkers(lps->pcxt, nworkers);
+
+		/* Enable shared costing iff we process indexes in parallel. */
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/*
+	 * Join as a parallel worker.  The leader process alone does the work
+	 * in case no workers were launched.
+	 */
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
+										 vacrelstats->dead_tuples);
+
+	/*
+	 * At this point, any indexes that were skipped during the parallel
+	 * operation remain.  If there are such indexes, the leader process
+	 * vacuums or cleans them up one by one.
+	 */
+	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
+									  lps);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	/* Disable shared cost balance for vacuum delay */
+	VacuumSharedCostBalance = NULL;
+	VacuumActiveNWorkers = NULL;
+
+	/*
+	 * In the cleanup case we don't need to reinitialize the parallel
+	 * context, as no more index vacuuming or index cleanup will be
+	 * performed after this.
+	 */
+	if (!lps->lvshared->for_cleanup)
+	{
+		/* Reset the processing counts */
+		pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+		pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+		/*
+		 * Reinitialize the parallel context to relaunch parallel workers
+		 * for the next execution.
+		 */
+		ReinitializeParallelDSM(lps->pcxt);
+	}
+}
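
For example, in the first cleanup pass over a table with two indexes that
always support parallel cleanup and one that supports it conditionally (a
hypothetical mix), the code above plans 2 + 1 - 1 = 2 workers, on the
assumption that the leader takes one of the indexes, and then caps that by
lps->pcxt->nworkers.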
+
+/*
+ * Index vacuuming and index cleanup routine used by parallel vacuum
+ * worker processes and the leader process to process the indexes in
+ * parallel.
+ */
+static void
+vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
+								 IndexBulkDeleteResult **stats,
+								 LVShared *lvshared,
+								 LVDeadTuples *dead_tuples)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Skip processing indexes that don't support parallel operation */
+		if (skip_parallel_index_vacuum(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * This must exist in DSM as we reach here only for indexes that
+		 * support the parallel operation.
+		 */
+		Assert(shared_indstats);
+
+		/* Do vacuum or cleanup one index */
+		vacuum_or_cleanup_one_index_worker(Irel[idx], &(stats[idx]),
+										   lvshared, shared_indstats,
+										   dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up indexes that were skipped during the parallel operation
+ * because they don't support parallel processing in that phase.  This
+ * function must be called only by the leader process.
+ */
+static void
+vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
+								  int nindexes, IndexBulkDeleteResult **stats,
+								  LVParallelState *lps)
+{
+	int nindexes_remains;
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	nindexes_remains = nindexes - pg_atomic_read_u32(&(lps->lvshared->nprocessed));
+	Assert(nindexes_remains >= 0);
+
+	/* Quick exit if all indexes have already been processed */
+	if (nindexes_remains == 0)
+		return;
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool processed = !skip_parallel_index_vacuum(Irel[i], lps->lvshared);
+
+		/* Skip indexes that were processed during the parallel operation */
+		if (processed)
+			continue;
+
+		vacuum_or_cleanup_one_index_worker(Irel[i], &(stats[i]),
+										   lps->lvshared, get_indstats(lps->lvshared, i),
+										   vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up one index.  This is used by both parallel vacuum
+ * workers and the leader process.  After processing the index, this
+ * function copies the index statistics returned from ambulkdelete or
+ * amvacuumcleanup to the DSM segment.
+ */
+static void
+vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
+								   LVShared *lvshared, LVSharedIndStats *shared_indstats,
+								   LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result
+		 * if someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup one index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete or
+	 * amvacuumcleanup to the DSM segment the first time we get it from
+	 * them, because they allocate it locally and the index might be
+	 * vacuumed by a different vacuum process next time.  Copying the
+	 * result normally happens only after the first index-vacuum cycle.
+	 * From the second cycle onward, we pass the result in the DSM segment
+	 * so that it is updated there directly.
+	 *
+	 * Since all vacuum workers write their bulk-deletion results to
+	 * different slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, shared_indstats->size);
+		shared_indstats->updated = true;
+
+		/*
+		 * We no longer need the locally allocated result; *stats now
+		 * points into the DSM segment.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
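
Concretely, the handoff goes like this: in the first index-vacuum cycle,
ambulkdelete() allocates the result locally in whichever process handles the
index; that process then copies it into the index's DSM slot and sets
"updated".  In any later cycle, the process that picks up the index
(possibly a different one) sees "updated", points *stats at the DSM copy up
front, and the access method updates it in place.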
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					int nindexes, IndexBulkDeleteResult **stats,
+					LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
+					 int nindexes, IndexBulkDeleteResult **stats,
+					 LVParallelState *lps)
+{
+	int		idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+						(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of
+		 * surviving tuples (we assume indexes are more interested in that
+		 * than in the number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+					(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+												stats, lps);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulk delete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1753,30 +2395,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	char		*msgfmt;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1784,49 +2434,62 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+	else
+		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msgfmt,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
+}
 
-	pfree(stats);
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
 }
 
 /*
@@ -2134,19 +2797,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2160,34 +2821,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
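
For instance, with maintenance_work_mem set to 64MB, a table with indexes
can remember up to 64MB / sizeof(ItemPointerData), roughly 11 million dead
TIDs per index-vacuum cycle (assuming the usual 6-byte ItemPointerData),
subject to the caps above.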
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples	*dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2201,12 +2877,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples	*dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2354,3 +3030,415 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation sizes of the table and indexes don't affect the parallel degree
+ * for now.  nrequested is the number of parallel workers that the user
+ * requested.  If nrequested is 0, we compute the parallel degree based on
+ * the number of indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	bool	leaderparticipates = true;
+	int		nindexes_parallel = 0;
+	int		nindexes_parallel_bulkdel = 0;
+	int		nindexes_parallel_cleanup = 0;
+	int		parallel_workers;
+	int		i;
+
+	Assert(nrequested >= 0);
+
+	/* Return immediately when parallelism is disabled */
+	if (max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
+	/* The leader process takes one index */
+	if (leaderparticipates)
+		nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
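
As an example with a hypothetical index mix: for a table with three indexes
where two support parallel bulk-deletion and all three support some form of
parallel cleanup, nindexes_parallel is Max(2, 3) = 3; with leader
participation that becomes 2, so a PARALLEL request without an explicit
degree asks for Min(2, max_parallel_maintenance_workers) workers.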
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of the statistics area for each index.  This function also
+ * computes the maintenance_work_mem to give each parallel worker, dividing
+ * the total by the smaller of the worker count and the number of indexes
+ * that both support parallel vacuuming and use maintenance_work_mem.  Since
+ * we don't currently support parallel vacuum for autovacuum, we don't need
+ * to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+	int nworkers)
+{
+	char *p = (char *) GetSharedIndStats(lvshared);
+	int nindexes_mwm = 0;
+	int i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats;
+
+		if (Irel[i]->rd_indam->amparallelvacuumoptions ==
+			VACUUM_OPTION_NO_PARALLEL)
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+
+		/* Set the size for index statistics */
+		indstats = (LVSharedIndStats *) p;
+		indstats->size = index_parallelvacuum_estimate(Irel[i]);
+
+		p += SizeOfSharedIndStats(indstats);
+	}
+
+	/* Compute the new maintenance_work_mem value for index vacuuming */
+	lvshared->maintenance_work_mem_worker =
+					(nindexes_mwm > 0) ?
+					maintenance_work_mem / Min(nworkers, nindexes_mwm) :
+					maintenance_work_mem;
+}
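
For instance, if maintenance_work_mem is 1GB, two of the parallel-capable
indexes use it (nindexes_mwm = 2), and four workers are planned, each worker
will run with maintenance_work_mem set to 1GB / Min(4, 2) = 512MB, since at
most two indexes can consume that budget at any one time.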
+
+/*
+ * Enter parallel mode, allocate and initialize the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
+					  Relation *Irel, int nindexes, int nrequested)
+{
+	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	ParallelContext *pcxt;
+	LVShared		*shared;
+	LVDeadTuples	*dead_tuples;
+	long	maxtuples;
+	char	*sharedquery;
+	Size	est_shared;
+	Size	est_deadtuples;
+	int		querylen;
+	int		i;
+
+	Assert(nrequested > 0);
+	Assert(nindexes > 0);
+
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+								 nrequested);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
+		{
+			est_shared = add_size(est_shared,
+								  add_size(sizeof(LVSharedIndStats),
+										   index_parallelvacuum_estimate(Irel[i])));
+
+			/*
+			 * Remember the number of indexes that support parallel operation
+			 * for each phase.
+			 */
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+				lps->nindexes_parallel_bulkdel++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+				lps->nindexes_parallel_cleanup++;
+			if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+				lps->nindexes_parallel_condcleanup++;
+		}
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+
+	/*
+	 * We need to take care of alignment here because we estimated the
+	 * shared memory size with MAXALIGN.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
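
The DSM segment thus ends up with three TOC entries: PARALLEL_VACUUM_KEY_SHARED
(the LVShared struct, bitmap and per-index statistics slots),
PARALLEL_VACUUM_KEY_DEAD_TUPLES (the shared LVDeadTuples array, sized by
compute_max_dead_tuples()), and PARALLEL_VACUUM_KEY_QUERY_TEXT (the query
string that workers report in pg_stat_activity).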
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Writes are not allowed during parallel mode, and it might not be safe to
+ * exit parallel mode while keeping the parallel context.  So we copy the
+ * updated index statistics into local memory and later use that to update
+ * the index statistics.
+ */
+static void
+end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
+					IndexBulkDeleteResult **stats)
+{
+	int i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already
+		 * stored in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   GetIndexBulkDeleteResult(indstats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int		i;
+	char	*p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += SizeOfSharedIndStats(p);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index doesn't participate in parallel index
+ * vacuuming or parallel index cleanup in the current phase, i.e. it should
+ * be skipped by parallel workers.
+ */
+static bool
+skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
+{
+	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
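
To summarize the checks above, per the bits in amparallelvacuumoptions:

    flag set                bulk-deletion    cleanup &&      cleanup &&
                            phase            first_time      !first_time
    PARALLEL_BULKDEL        in parallel      (not checked)   (not checked)
    PARALLEL_CLEANUP        (not checked)    in parallel     in parallel
    PARALLEL_COND_CLEANUP   (not checked)    in parallel     skipped

Indexes skipped in a given phase are processed serially by the leader in
vacuum_or_cleanup_skipped_indexes().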
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers work only within index vacuuming and index
+ * cleanup, there is no need to report progress information.
+ */
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation	*indrels;
+	LVShared	*lvshared;
+	LVDeadTuples	*dead_tuples;
+	int			nindexes;
+	char		*sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as that of the leader
+	 * process.  It's okay because the lock mode does not conflict among
+	 * the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match the
+	 * order in the leader process.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Do either vacuuming indexes or cleaning indexes */
+	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
+									 dead_tuples);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
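
Putting it together, a worker's life cycle is short: look up LVShared, adopt
the leader's elevel and query text, open the table and its indexes in OID
order, attach to the shared dead-tuple array and cost balance, make one pass
over the indexes via vacuum_or_cleanup_indexes_worker(), then close
everything and exit.  The leader relaunches a fresh set of workers (after
ReinitializeParallelDSM()) for each subsequent index-vacuum cycle.
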
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 1dd39a9535..c9972f5f37 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1428,7 +1428,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_QUERY_TEXT, sharedquery);
 
 	/* Launch workers, saving status for leader/caller */
-	LaunchParallelWorkers(pcxt);
+	LaunchParallelWorkers(pcxt, request);
 	btleader->pcxt = pcxt;
 	btleader->nparticipanttuplesorts = pcxt->nworkers_launched;
 	if (leaderparticipates)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..157c309211 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
 	}
 };
 
@@ -490,10 +494,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
  * Launch parallel workers.
  */
 void
-LaunchParallelWorkers(ParallelContext *pcxt)
+LaunchParallelWorkers(ParallelContext *pcxt, int nworkers)
 {
 	MemoryContext oldcontext;
 	BackgroundWorker worker;
+	int			nworkers_to_launch = Min(nworkers, pcxt->nworkers);
 	int			i;
 	bool		any_registrations_failed = false;
 
@@ -533,7 +538,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..4b7f480fd6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -99,6 +100,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +131,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested, but the user didn't
+				 * specify the parallel degree.  The degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +194,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +412,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1768,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is
+	 * the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = 0;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1985,73 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double	msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
+	if (!VacuumCostActive || InterruptPending)
+		return;
+
+	/*
+	 * If the vacuum cost balance is shared among parallel workers, we
+	 * decide whether to sleep based on that.
+	 */
+	if (VacuumSharedCostBalance != NULL)
 	{
-		double		msec;
+		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+		/* At least count itself */
+		Assert(nworkers >= 1);
+
+		/* Update the shared cost balance value atomically */
+		while (true)
+		{
+			uint32 shared_balance;
+			uint32 new_balance;
+			uint32 local_balance;
+
+			msec = 0;
+
+			/* compute new balance by adding the local value */
+			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+			new_balance = shared_balance + VacuumCostBalance;
 
+			/* also compute the total local balance */
+			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+			if ((new_balance >= VacuumCostLimit) &&
+				(local_balance > 0.5 * ((double) VacuumCostLimit / nworkers)))
+			{
+				/* compute sleep time based on the local cost balance */
+				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+				new_balance = shared_balance - VacuumCostBalanceLocal;
+			}
+
+			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+											   &shared_balance,
+											   new_balance))
+			{
+				/*
+				 * Updated successfully.  If we decided to sleep, the local
+				 * balance has been consumed into the shared balance, so
+				 * reset it only now; resetting it before a failed
+				 * compare-and-exchange would lose it on retry.
+				 */
+				if (msec > 0)
+					VacuumCostBalanceLocal = 0;
+				break;
+			}
+		}
+
+		VacuumCostBalanceLocal += VacuumCostBalance;
+
+		/*
+		 * Reset VacuumCostBalance as we have accumulated it into the
+		 * shared value.
+		 */
+		VacuumCostBalance = 0;
+	}
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 69d5a1f239..df28ff2927 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -183,7 +183,7 @@ ExecGather(PlanState *pstate)
 			 * requested, or indeed any at all.
 			 */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gather->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 6ef128e2ab..cb9d5a725a 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -224,7 +224,7 @@ ExecGatherMerge(PlanState *pstate)
 
 			/* Try to launch workers. */
 			pcxt = node->pei->pcxt;
-			LaunchParallelWorkers(pcxt);
+			LaunchParallelWorkers(pcxt, gm->num_workers);
 			/* We save # workers launched for the benefit of EXPLAIN */
 			node->nworkers_launched = pcxt->nworkers_launched;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index df26826993..4df3de429e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3585,7 +3585,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..61725e749f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -190,9 +192,13 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..e5e6ae6c08 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -63,7 +63,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
-extern void LaunchParallelWorkers(ParallelContext *pcxt);
+extern void LaunchParallelWorkers(ParallelContext *pcxt, int nworkers);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
 extern void DestroyParallelContext(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5f23f1ab1d..0a586dca8d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -218,6 +218,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers
+	 * and 0 for choosing based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..cf5e1f0a4e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0aecf17773 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

Attachment: v35-0003-Code-review-amit.patch (application/octet-stream)
From 878dc4587fdf4774fc78c54ffadd938ed7595030 Mon Sep 17 00:00:00 2001
From: Amit Kapila <amit.kapila@enterprisedb.com>
Date: Tue, 3 Dec 2019 15:36:46 +0530
Subject: [PATCH] Code review.

---
 src/backend/access/heap/vacuumlazy.c  | 441 +++++++++++++++++-----------------
 src/backend/access/transam/parallel.c |   4 +-
 src/backend/commands/vacuum.c         | 115 +++++----
 src/include/access/heapam.h           |   5 +-
 src/include/commands/vacuum.h         |   5 +
 5 files changed, 290 insertions(+), 280 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 36f4ef1..e1cbdb1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -33,11 +33,12 @@
  * processes exit.  And then the leader process re-initializes the parallel
  * context so that it can use the same DSM for multiple passses of index
  * vacuum and for performing index cleanup.  Note that all parallel workers
- * live during either index vacuuming or index cleanup but the leader process
- * neither exits from the parallel mode nor destroys the parallel context.
- * For updating the index statistics, since any updates are not allowed during
- * parallel mode we update the index statistics after exited from the parallel
- * mode.
+ * are alive only during index vacuum or index cleanup but the leader process
+ * neither exits from the parallel mode nor destroys the parallel context till
+ * the entire parallel operation is finished.  For updating the index
+ * statistics, we need to update the system table and since updates are not
+ * allowed during parallel mode we update the index statistics after exiting
+ * from the parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -175,7 +176,7 @@ typedef struct LVDeadTuples
 typedef struct LVShared
 {
 	/*
-	 * Target table relid and log level. These fields are not modified
+	 * Target table relid and log level.  These fields are not modified
 	 * during the lazy vacuum.
 	 */
 	Oid		relid;
@@ -190,13 +191,13 @@ typedef struct LVShared
 	bool	first_time;
 
 	/*
-	 * Fields for both index vacuuming and index cleanup.
+	 * Fields for both index vacuum and cleanup.
 	 *
 	 * reltuples is the total number of input heap tuples.  We set either
-	 * an old live tuples in index vacuuming case or the new live tuples in
-	 * index cleanup case.
+	 * old live tuples in the index vacuum case or the new live tuples in
+	 * the index cleanup case.
 	 *
-	 * estimated_count is true if the reltuples is estimated value.
+	 * estimated_count is true if the reltuples is an estimated value.
 	 */
 	double	reltuples;
 	bool	estimated_count;
@@ -318,27 +319,6 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
-/*
- * Variables for cost-based vacuum delay for parallel index vacuuming.
- * The basic idea of cost-based vacuum delay for parallel index vacuuming
- * is to allow all parallel vacuum workers including the leader process
- * to have a shared view of cost related parameters (mainly VacuumCostBalance)
- * and allow each worker to update it and then based on that decide
- * whether it needs to sleep.  Besides, we allow any worker to sleep
- * only if it has performed the I/O above a certain threshold, which is
- * calculated based on the number of active workers (VacuumActiveNWorkers),
- * and the overall cost balance is more than VacuumCostLimit set by the
- * system.  Then we will allow the worker to sleep proportional to the work
- * done and reduce the VacuumSharedCostBalance by the amount which is
- * consumed by the current worker (VacuumCostBalanceLocal).  This can
- * avoid letting the workers sleep which has done less or no I/O as compared
- * to other workers, and therefore can ensure that workers who are doing
- * more I/O got throttled more.
- */
-pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
-pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
-int					VacuumCostBalanceLocal = 0;
-
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
@@ -364,38 +344,37 @@ static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
-static void lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
-													int nindexes, IndexBulkDeleteResult **stats,
-													LVParallelState *lps);
-static void vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
-											 IndexBulkDeleteResult **stats,
-											 LVShared *lvshared,
-											 LVDeadTuples *dead_tuples);
-static void vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
-											  int nindexes, IndexBulkDeleteResult **stats,
-											  LVParallelState *lps);
-static void vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
-											   LVShared *lvshared, LVSharedIndStats *shared_indstats,
-											   LVDeadTuples *dead_tuples);
-static void lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
-								int nindexes, IndexBulkDeleteResult **stats,
-								LVParallelState *lps);
-static void lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
-								 int nindexes, IndexBulkDeleteResult **stats,
-								 LVParallelState *lps);
-static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
-									int nindexes);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_skipped_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								   LVRelStats *vacrelstats, LVParallelState *lps,
+								   int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
 static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
-static int compute_parallel_workers(Relation *Irel, int nindexes, int nrequested);
-static void prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
+static int compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(Relation *Irel, LVShared *lvshared, int nindexes,
 									 int nworkers);
-static LVParallelState *begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid,
-											  BlockNumber nblocks, Relation *Irel,
-											  int nindexes, int nrequested);
-static void end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
-								IndexBulkDeleteResult **stats);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+							LVRelStats *vacrelstats, BlockNumber nblocks,
+							int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
-static bool skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -753,7 +732,6 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
-	int			parallel_workers = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -790,32 +768,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
 	/*
-	 * Compute the number of parallel vacuum workers to launch if the parallel
-	 * vacuum is requested and we need to vacuum the indexes.
+	 * Try to initialize the parallel vacuum if requested
 	 */
 	if (params->nworkers >= 0 && vacrelstats->useindex)
-		parallel_workers = compute_parallel_workers(Irel, nindexes,
-													params->nworkers);
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
 
-	if (parallel_workers > 0)
-	{
-		/*
-		 * Enter parallel mode, create the parallel context and allocate the
-		 * DSM segment.
-		 */
-		lps = begin_parallel_vacuum(vacrelstats,
-									RelationGetRelid(onerel),
-									nblocks, Irel, nindexes,
-									parallel_workers);
-	}
-	else
-	{
-		/*
-		 * Use single process vacuum. We allocate the memory space for dead
-		 * tuples locally.
-		 */
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
 		lazy_space_alloc(vacrelstats, nblocks);
-	}
 
 	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
@@ -1030,7 +995,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -1695,7 +1660,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		lazy_vacuum_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1723,14 +1688,14 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-		lazy_cleanup_indexes(vacrelstats, Irel, nindexes, indstats, lps);
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 	/*
 	 * End parallel mode before updating index statistics as we cannot write
 	 * during parallel mode.
 	 */
 	if (ParallelVacuumIsActive(lps))
-		end_parallel_vacuum(lps, Irel, nindexes, indstats);
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
 
 	/* Update index statistics */
 	 update_index_statistics(Irel, indstats, nindexes);
@@ -2004,9 +1969,9 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
  * or cleanup.
  */
 static void
-lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
-										int nindexes, IndexBulkDeleteResult **stats,
-										LVParallelState *lps)
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
 {
 	int	nworkers;
 
@@ -2030,31 +1995,39 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	/* Cap by the worker we computed at the beginning of parallel lazy vacuum */
 	nworkers = Min(nworkers, lps->pcxt->nworkers);
 
-	/* Setup the shared cost-based vacuum delay and launch workers*/
+	/* Setup the shared cost-based vacuum delay and launch workers */
 	if (nworkers > 0)
 	{
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
 		/*
-		 * Reset the local value so that we compute cost balance during
-		 * parallel index vacuuming.
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
 		 */
-		VacuumCostBalance = 0;
-		VacuumCostBalanceLocal = 0;
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
 
 		LaunchParallelWorkers(lps->pcxt, nworkers);
 
-		/* Enable shared costing iff we process indexes in parallel. */
 		if (lps->pcxt->nworkers_launched > 0)
 		{
-			/* Enable shared cost balance */
-			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
-
 			/*
-			 * Set up shared cost balance and the number of active workers for
-			 * vacuum delay.
+			 * Reset the local value so that we compute cost balance during
+			 * parallel index vacuuming.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
 			 */
-			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
 		}
 
 		if (lps->lvshared->for_cleanup)
@@ -2072,20 +2045,24 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	}
 
 	/*
-	 * Join as a parallel worker. The leader process alone does that in
-	 * case where no workers launched.
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
 	 */
 	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
-		vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, lps->lvshared,
-										 vacrelstats->dead_tuples);
+		parallel_vacuum_index(Irel, stats, lps->lvshared,
+							  vacrelstats->dead_tuples, nindexes);
 
 	/*
-	 * Here, the indexes that had been skipped during parallel index vacuuming
-	 * are remaining. If there are such indexes the leader process does vacuum
-	 * or cleanup them one by one.
+	 * Now process the indexes that had been skipped during parallel index
+	 * vacuuming.
+	 *
+	 * XXX - We can do this before the leader backend participates as a worker
+	 * and that might make a few cases faster where indexes are of equal size,
+	 * but OTOH, it might slow down a few cases where indexes are of unequal
+	 * size.  In anycase, it won't matter much as we don't have indexes that
+	 * don't support parallel vacuum yet.
 	 */
-	vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
-									  lps);
+	vacuum_skipped_indexes(Irel, stats, vacrelstats, lps, nindexes);
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
@@ -2099,9 +2076,9 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 	VacuumActiveNWorkers = NULL;
 
 	/*
-	 * In cleanup case we don't need to reinitialize the parallel
-	 * context as no more index vacuuming and index cleanup will be
-	 * performed after that.
+	 * In the cleanup case, we don't need to reinitialize the parallel context
+	 * as no more index vacuuming or index cleanup will be performed after
+	 * that.
 	 */
 	if (!lps->lvshared->for_cleanup)
 	{
@@ -2118,15 +2095,13 @@ lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 }
 
 /*
- * Index vacuuming and index cleanup routine used by parallel vacuum
- * worker processes and the leader process to process the indexes in
- * parallel.
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
  */
 static void
-vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
-								 IndexBulkDeleteResult **stats,
-								 LVShared *lvshared,
-								 LVDeadTuples *dead_tuples)
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
 {
 	/*
 	 * Increment the active worker count if we are able to launch any worker.
@@ -2148,7 +2123,7 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 			break;
 
 		/* Skip processing indexes that doesn't support parallel operation */
-		if (skip_parallel_index_vacuum(Irel[idx], lvshared))
+		if (skip_parallel_vacuum_index(Irel[idx], lvshared))
 			continue;
 
 		/* Increment the processing count */
@@ -2164,9 +2139,8 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 		Assert(shared_indstats);
 
 		/* Do vacuum or cleanup one index */
-		vacuum_or_cleanup_one_index_worker(Irel[idx], &(stats[idx]),
-										   lvshared, shared_indstats,
-										   dead_tuples);
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
 	}
 
 	/*
@@ -2179,13 +2153,13 @@ vacuum_or_cleanup_indexes_worker(Relation *Irel, int nindexes,
 
 /*
  * Vacuum or cleanup indexes that have been skipped during parallel operation
- * because these indexes don't support parallel operation at that phase.  Therefore
- * this function must be called by the leader process.
+ * because these indexes don't support parallel operation at that phase.
+ * Therefore this function must be called by the leader process.
  */
 static void
-vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
-								  int nindexes, IndexBulkDeleteResult **stats,
-								  LVParallelState *lps)
+vacuum_skipped_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					   LVRelStats *vacrelstats, LVParallelState *lps,
+					   int nindexes)
 {
 	int nindexes_remains;
 	int i;
@@ -2207,15 +2181,15 @@ vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 
 	for (i = 0; i < nindexes; i++)
 	{
-		bool processed = !skip_parallel_index_vacuum(Irel[i], lps->lvshared);
+		bool processed = !skip_parallel_vacuum_index(Irel[i], lps->lvshared);
 
 		/* Skip the already processed indexes */
 		if (processed)
 			continue;
 
-		vacuum_or_cleanup_one_index_worker(Irel[i], &(stats[i]),
-										   lps->lvshared, get_indstats(lps->lvshared, i),
-										   vacrelstats->dead_tuples);
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
 	}
 
 	/*
@@ -2227,15 +2201,15 @@ vacuum_or_cleanup_skipped_indexes(LVRelStats *vacrelstats, Relation *Irel,
 }
 
 /*
- * Vacuum or cleanup one index by worker processing including the leader
- * process.  After finished each indexes this function copies the index
+ * Vacuum or cleanup index either by leader process or by one of the worker
+ * process.  After processing the index this function copies the index
  * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
  * segment.
  */
 static void
-vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stats,
-								   LVShared *lvshared, LVSharedIndStats *shared_indstats,
-								   LVDeadTuples *dead_tuples)
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
 {
 	IndexBulkDeleteResult *bulkdelete_res = NULL;
 
@@ -2292,9 +2266,9 @@ vacuum_or_cleanup_one_index_worker(Relation indrel, IndexBulkDeleteResult **stat
  * parallel vacuum.
  */
 static void
-lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
-					int nindexes, IndexBulkDeleteResult **stats,
-					LVParallelState *lps)
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
 {
 	int		idx;
 
@@ -2315,8 +2289,7 @@ lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
 		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
 		lps->lvshared->estimated_count = true;
 
-		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
-												stats, lps);
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
 	}
 	else
 	{
@@ -2331,9 +2304,9 @@ lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
  * parallel vacuum.
  */
 static void
-lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
-					 int nindexes, IndexBulkDeleteResult **stats,
-					 LVParallelState *lps)
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
 {
 	int		idx;
 
@@ -2360,8 +2333,7 @@ lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
 		lps->lvshared->estimated_count =
 					(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
 
-		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
-												stats, lps);
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
 	}
 	else
 	{
@@ -2376,16 +2348,17 @@ lazy_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
  *		reltuples is the number of heap tuples to be passed to the
- *		bulk delete callback.
+ *		bulkdelete callback.
  */
 static void
 lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
 				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
-	char		*msgfmt;
+	const char	*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -2403,12 +2376,12 @@ lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
 							   lazy_tid_reaped, (void *) dead_tuples);
 
 	if (IsParallelWorker())
-		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
 	else
-		msgfmt = gettext_noop("scanned index \"%s\" to remove %d row versions");
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg(msgfmt,
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
 					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
@@ -2426,7 +2399,7 @@ lazy_cleanup_index(Relation indrel,
 				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
-	char		*msgfmt;
+	const char	*msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -2446,12 +2419,12 @@ lazy_cleanup_index(Relation indrel,
 		return;
 
 	if (IsParallelWorker())
-		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages, reported by parallel vacuum worker");
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
 	else
-		msgfmt = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg(msgfmt,
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
 					(*stats)->num_index_tuples,
 					(*stats)->num_pages),
@@ -2464,35 +2437,6 @@ lazy_cleanup_index(Relation indrel,
 }
 
 /*
- * Update index statistics in pg_class if the statistics is accurate.
- */
-static void
-update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
-						int nindexes)
-{
-	int i;
-
-	Assert(!IsInParallelMode());
-
-	for (i = 0; i < nindexes; i++)
-	{
-		if (stats[i] == NULL || stats[i]->estimated_count)
-			continue;
-
-		/* Update index statistics */
-		vac_update_relstats(Irel[i],
-							stats[i]->num_pages,
-							stats[i]->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
-		pfree(stats[i]);
-	}
-}
-
-/*
  * should_attempt_truncation - should we attempt to truncate the heap?
  *
  * Don't even think about it unless we have a shot at releasing a goodly
@@ -3032,7 +2976,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 }
 
 /*
- * Compute the number of parallel worker processes to request. Both index
+ * Compute the number of parallel worker processes to request.  Both index
  * vacuuming and index cleanup can be executed together with parallel workers.
  * The relation sizes of table and indexes don't affect to the parallel
  * degree for now. nrequested is the number of parallel workers that user
@@ -3041,7 +2985,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
  * vacuuming.
  */
 static int
-compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested)
 {
 	bool	leaderparticipates = true;
 	int		nindexes_parallel = 0;
@@ -3050,10 +2994,11 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
 	int		parallel_workers;
 	int		i;
 
-	Assert(nrequested >= 0);
-
-	/* Return immediately when parallelism disabled */
-	if (max_parallel_maintenance_workers == 0)
+	/*
+	 * We don't allow to perform parallel operation in standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
 		return 0;
 
 	/*
@@ -3104,8 +3049,8 @@ compute_parallel_workers(Relation *Irel, int nindexes, int nrequested)
  * for autovacuum we don't need to care about autovacuum_work_mem.
  */
 static void
-prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
-	int nworkers)
+prepare_index_statistics(Relation *Irel, LVShared *lvshared, int nindexes,
+						 int nworkers)
 {
 	char *p = (char *) GetSharedIndStats(lvshared);
 	int nindexes_mwm = 0;
@@ -3145,13 +3090,44 @@ prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes,
 }
 
 /*
- * Enter parallel mode, allocate and initialize the DSM segment.
+ * Update index statistics in pg_class if the statistics is accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns parallel vacuum state if we can launch
+ * even one worker.  This function is responsible to create a parallel
+ * context, enter parallel mode, and then initialize the DSM segment.
  */
 static LVParallelState *
-begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
-					  Relation *Irel, int nindexes, int nrequested)
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
 {
-	LVParallelState *lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
 	LVShared		*shared;
 	LVDeadTuples	*dead_tuples;
@@ -3159,12 +3135,29 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	char	*sharedquery;
 	Size	est_shared;
 	Size	est_deadtuples;
+	int		parallel_workers = 0;
 	int		querylen;
 	int		i;
 
-	Assert(nrequested > 0);
+	/*
+	 * a parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
 	Assert(nindexes > 0);
 
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+		return lps;
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
 	lps->leaderparticipates = true;
 
 #ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
@@ -3172,8 +3165,8 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 #endif
 
 	EnterParallelMode();
-	pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
-								 nrequested);
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
 	lps->pcxt = pcxt;
 	Assert(pcxt->nworkers > 0);
 
@@ -3199,7 +3192,7 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 
 			/*
 			 * Remember the number of indexes that support parallel operation
-			 * for each phases.
+			 * for each phase.
 			 */
 			if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
 				lps->nindexes_parallel_bulkdel++;
@@ -3237,7 +3230,7 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 	 * in that way.
 	 */
 	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
-	prepare_index_statistics(shared, Irel, nindexes, nrequested);
+	prepare_index_statistics(Irel, shared, nindexes, parallel_workers);
 	pg_atomic_init_u32(&(shared->idx), 0);
 	pg_atomic_init_u32(&(shared->nprocessed), 0);
 	pg_atomic_init_u32(&(shared->cost_balance), 0);
@@ -3266,14 +3259,15 @@ begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
 /*
  * Destroy the parallel context, and end parallel mode.
  *
- * All writes are not allowed during parallel mode and it might not be
- * safe to exit from the parallel mode while keeping the parallel context.
- * So we copy the updated index statistics to a local memory and then later
- * use that to update the index statistics.
+ * Since writes are not allowed during the parallel mode, so we copy the
+ * updated index statistics from DSM in local memory and then later use that
+ * to update the index statistics.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
  */
 static void
-end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
-					IndexBulkDeleteResult **stats)
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
 {
 	int i;
 
@@ -3285,8 +3279,8 @@ end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes,
 		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
 
 		/*
-		 * Skip unused slot.  The statistics of this index are already
-		 * stored in local memory.
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
 		 */
 		if (indstats == NULL)
 			continue;
@@ -3333,11 +3327,11 @@ get_indstats(LVShared *lvshared, int n)
 }
 
 /*
- * Check if the given index participates in parallel index vacuuming
- * or parallel index cleanup.
+ * Check if the given index participates in parallel index vacuum or parallel
+ * index cleanup.
  */
 static bool
-skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
 {
 	uint8 vacoptions = indrel->rd_indam->amparallelvacuumoptions;
 
@@ -3373,11 +3367,11 @@ skip_parallel_index_vacuum(Relation indrel, LVShared *lvshared)
 /*
  * Perform work within a launched parallel process.
  *
- * Since parallel vacuum workers work only within index vacuuming and index
- * cleanup, no need to report the progress information.
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
  */
 void
-heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 {
 	Relation	onerel;
 	Relation	*indrels;
@@ -3434,9 +3428,8 @@ heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	if (lvshared->maintenance_work_mem_worker > 0)
 		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
 
-	/* Do either vacuuming indexes or cleaning indexes */
-	vacuum_or_cleanup_indexes_worker(indrels, nindexes, stats, lvshared,
-									 dead_tuples);
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
 
 	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
 	table_close(onerel, ShareUpdateExclusiveLock);
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 157c309..a387932 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -142,7 +142,7 @@ static const struct
 		"_bt_parallel_build_main", _bt_parallel_build_main
 	},
 	{
-		"heap_parallel_vacuum_main", heap_parallel_vacuum_main
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -498,8 +498,8 @@ LaunchParallelWorkers(ParallelContext *pcxt, int nworkers)
 {
 	MemoryContext oldcontext;
 	BackgroundWorker worker;
-	int			nworkers_to_launch = Min(nworkers, pcxt->nworkers);;
 	int			i;
+	int			nworkers_to_launch = Min(nworkers, pcxt->nworkers);
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 4b7f480..f6dc890 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -69,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32	*VacuumSharedCostBalance = NULL;
+pg_atomic_uint32	*VacuumActiveNWorkers = NULL;
+int					VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -77,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -1994,58 +2003,11 @@ vacuum_delay_point(void)
 		return;
 
 	/*
-	 * If the vacuum cost balance is shared among parallel workers we
-	 * decide whether to sleep based on that.
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
 	 */
 	if (VacuumSharedCostBalance != NULL)
-	{
-		int nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
-
-		/* At least count itself */
-		Assert(nworkers >= 1);
-
-		/* Update the shared cost balance value atomically */
-		while (true)
-		{
-			uint32 shared_balance;
-			uint32 new_balance;
-			uint32 local_balance;
-
-			msec = 0;
-
-			/* compute new balance by adding the local value */
-			shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
-			new_balance = shared_balance + VacuumCostBalance;
-
-			/* also compute the total local balance */
-			local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
-
-			if ((new_balance >= VacuumCostLimit) &&
-				(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
-			{
-				/* compute sleep time based on the local cost balance */
-				msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
-				new_balance = shared_balance - VacuumCostBalanceLocal;
-				VacuumCostBalanceLocal = 0;
-			}
-
-			if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
-											   &shared_balance,
-											   new_balance))
-			{
-				/* Updated successfully, break */
-				break;
-			}
-		}
-
-		VacuumCostBalanceLocal += VacuumCostBalance;
-
-		/*
-		 * Reset the local balance as we accumulated it into the shared
-		 * value.
-		 */
-		VacuumCostBalance = 0;
-	}
+		msec = compute_parallel_delay();
 	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
 
@@ -2068,6 +2030,59 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance)
+ * and allow each worker to update it and then based on that decide
+ * whether it needs to sleep.  Besides, we allow any worker to sleep only
+ * if it has performed the I/O above a certain threshold, which is calculated
+ * based on the number of active workers (VacuumActiveNWorkers), and the
+ * overall cost balance is more than VacuumCostLimit set by the system.  Then
+ * we will allow the worker to sleep proportional to the work done and reduce
+ * the VacuumSharedCostBalance by the amount which is consumed by the current
+ * worker (VacuumCostBalanceLocal).  This can avoid letting the workers sleep
+ * who have done less or no I/O as compared to other workers and therefore can
+ * ensure that workers who are doing more I/O got throttled more.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double	msec = 0;
+	uint32	shared_balance;
+	int		nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* also compute the total local balance */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		shared_balance = pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared
+	 * value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 61725e7..e89c125 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -192,13 +192,10 @@ extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);
 
 /* in heap/vacuumlazy.c */
-extern pg_atomic_uint32	*VacuumSharedCostBalance;
-extern pg_atomic_uint32	*VacuumActiveNWorkers;
-extern int	VacuumCostBalanceLocal;
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
-extern void heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 0a586dc..5222e33 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -232,6 +232,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
1.8.3.1

#260Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#259)

On Tue, Dec 3, 2019 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

A few other things I would like you to consider:
1. I think disable_parallel_leader_participation related code can be
extracted into a separate patch as it is mainly a debug/test aid. You can
also fix the problem reported by Mahendra in that context.

2. I think if we can somehow disallow very small indexes to use parallel
workers, then it will be better. Can we use min_parallel_index_scan_size
to decide whether a particular index can participate in a parallel vacuum?

Forgot one minor point. Please run pgindent on all the patches.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#261Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#260)

On Tue, 3 Dec 2019 at 16:27, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 3, 2019 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

A few other things I would like you to consider:
1. I think disable_parallel_leader_participation related code can be
extracted into a separate patch as it is mainly a debug/test aid. You can
also fix the problem reported by Mahendra in that context.

2. I think if we can somehow disallow very small indexes to use parallel
workers, then it will be better. Can we use min_parallel_index_scan_size
to decide whether a particular index can participate in a parallel vacuum?

Forgot one minor point. Please run pgindent on all the patches.

While reviewing and testing the v35 patch set, I noticed some problems. Below
are some comments:

1.
  /*
+ * Since parallel workers cannot access data in temporary tables, parallel
+ * vacuum is not allowed for temporary relation. However rather than
+ * skipping vacuum on the table, just disabling parallel option is better
+ * option in most cases.
+ */
+ if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+ {
+ ereport(WARNING,
+ (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum
temporary tables in parallel",
+ RelationGetRelationName(onerel))));
+ params->nworkers = 0;
+ }

Here, I think we should set params->nworkers = -1 to disable parallel
vacuum for temporary tables. I noticed that even after the warning, we
were still doing the vacuum in parallel mode and launching parallel
workers, which was wrong.
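
For illustration, a minimal sketch of the corrected check (this assumes
the v35 naming; per the VacuumParams comment, -1 is the "parallel
disabled" value, so the warning path should assign -1 rather than 0):

	/*
	 * Sketch only: disable parallel vacuum for temporary tables.  This
	 * follows the v35 hunk quoted above; the only change is the final
	 * assignment.
	 */
	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
	{
		ereport(WARNING,
				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
						RelationGetRelationName(onerel))));
		params->nworkers = -1;	/* -1, not 0: 0 would still pick workers */
	}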

2.
Amit suggested that I check the time taken by the vacuum.sql regression test.

vacuum ... ok 20684 ms    (with the v35 patch set)
vacuum ... ok  1020 ms    (without the v35 patch set)

Here we can see that the time taken by the vacuum test has increased a
lot due to the parallel vacuum test cases, so I will try to come up with
a smaller test case.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#262Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#259)

On Tue, 3 Dec 2019 at 11:55, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 3, 2019 at 12:56 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Sun, 1 Dec 2019 at 18:31, Sergei Kornilov <sk@zsrv.org> wrote:

Hi

I think I got your point. Your proposal is that it's more efficient if
we make the leader process vacuum the indexes that can be processed
only by the leader process (i.e. indexes not supporting parallel index
vacuum) while the workers are processing indexes that do support
parallel index vacuum, right? That way, we can process indexes in
parallel as much as possible.

Right

So maybe we can call vacuum_or_cleanup_skipped_indexes first
and then call vacuum_or_cleanup_indexes_worker. But I'm not sure there
will be parallel-safe indexes remaining after the leader finishes
vacuum_or_cleanup_indexes_worker, as described in your proposal.

I meant that after processing the missed indexes (those not supporting parallel index vacuum), the leader can start processing indexes that support parallel index vacuum, along with the parallel workers.
That is, call vacuum_or_cleanup_skipped_indexes after starting the parallel workers but before vacuum_or_cleanup_indexes_worker, or something with a similar effect.
If we have zero missed indexes, parallel vacuum will run as in the current implementation, with leader participation.

I think your idea might not work well in some cases.

Good point. I am also not sure whether it is a good idea to make the suggested change, but I think adding a comment along those lines is not a bad idea, which I have done in the attached patch.

Thank you for updating the patch!

I have made some other changes as well.
1.
+ if (VacuumSharedCostBalance != NULL)
  {
- 	double		msec;
+ 	int			nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+ 	/* At least count itself */
+ 	Assert(nworkers >= 1);
+
+ 	/* Update the shared cost balance value atomically */
+ 	while (true)
+ 	{
+ 		uint32		shared_balance;
+ 		uint32		new_balance;
+ 		uint32		local_balance;
+
+ 		msec = 0;
+
+ 		/* compute new balance by adding the local value */
+ 		shared_balance = pg_atomic_read_u32(VacuumSharedCostBalance);
+ 		new_balance = shared_balance + VacuumCostBalance;
+
+ 		/* also compute the total local balance */
+ 		local_balance = VacuumCostBalanceLocal + VacuumCostBalance;
+
+ 		if ((new_balance >= VacuumCostLimit) &&
+ 			(local_balance > 0.5 * (VacuumCostLimit / nworkers)))
+ 		{
+ 			/* compute sleep time based on the local cost balance */
+ 			msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+ 			new_balance = shared_balance - VacuumCostBalanceLocal;
+ 			VacuumCostBalanceLocal = 0;
+ 		}
+
+ 		if (pg_atomic_compare_exchange_u32(VacuumSharedCostBalance,
+ 										   &shared_balance,
+ 										   new_balance))
+ 		{
+ 			/* Updated successfully, break */
+ 			break;
+ 		}
+ 	}
+
+ 	VacuumCostBalanceLocal += VacuumCostBalance;

I see multiple problems with this code.
(a) If VacuumSharedCostBalance is changed by the time of the
compare-and-exchange, then the next iteration might not compute the
correct values, as you might have reset VacuumCostBalanceLocal by that
time.
(b) In the line new_balance = shared_balance - VacuumCostBalanceLocal,
you need to use new_balance instead of shared_balance; otherwise it
won't account for the balance of the latest cycle.
(c) In the line msec = VacuumCostDelay * VacuumCostBalanceLocal /
VacuumCostLimit;, I think you need to use local_balance, for reasons
similar to (b).
(d) I think we can write this code with fewer variables.

In your code, I think if two workers enter the compute_parallel_delay
function at the same time, they both add their local balance to
VacuumSharedCostBalance, and both workers sleep because both values
reach the VacuumCostLimit. But one of the two workers should not sleep
in this case.

I have fixed all these problems and used a slightly different way to compute the parallel delay. See compute_parallel_delay() in the attached delta patch.

2.
+ /* Setup the shared cost-based vacuum delay and launch workers*/
+ if (nworkers > 0)
+ {
+ 	/*
+ 	 * Reset the local value so that we compute cost balance during
+ 	 * parallel index vacuuming.
+ 	 */
+ 	VacuumCostBalance = 0;
+ 	VacuumCostBalanceLocal = 0;
+
+ 	LaunchParallelWorkers(lps->pcxt, nworkers);
+
+ 	/* Enable shared costing iff we process indexes in parallel. */
+ 	if (lps->pcxt->nworkers_launched > 0)
+ 	{
+ 		/* Enable shared cost balance */
+ 		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ 		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ 		/*
+ 		 * Set up shared cost balance and the number of active workers for
+ 		 * vacuum delay.
+ 		 */
+ 		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ 		pg_atomic_write_u32(VacuumActiveNWorkers, 0);

This code has issues. We can't initialize VacuumSharedCostBalance/VacuumActiveNWorkers after launching workers, as by that time some other worker could already have changed their values. This was reported offlist by Mahendra, and I have fixed it.
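
The ordering fix is visible in the v35-0003 hunk earlier in this mail;
condensed, and with the v35 names assumed, it amounts to the following
sketch:

	/*
	 * Publish the shared state BEFORE launching workers, so that no
	 * worker can observe (or race against) an uninitialized balance.
	 */
	VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
	VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
	pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
	pg_atomic_write_u32(VacuumActiveNWorkers, 0);

	LaunchParallelWorkers(lps->pcxt, nworkers);

	if (lps->pcxt->nworkers_launched > 0)
	{
		/* Switch to shared cost accounting. */
		VacuumCostBalance = 0;
		VacuumCostBalanceLocal = 0;
	}
	else
	{
		/* No workers launched: fall back to purely local accounting. */
		VacuumSharedCostBalance = NULL;
		VacuumActiveNWorkers = NULL;
	}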

3. Changed the names of functions that were too long; I think the new names are more meaningful. If you don't agree with these changes, we can discuss it.

4. Changed the order of parameters in many functions to match the existing code.

5. Refactored the code in a few places so that it is easier to follow.

6. Added/edited many comments and made other cosmetic changes.

You can find all these changes in v35-0003-Code-review-amit.patch.

I've confirmed these changes and they look good to me.

A few other things I would like you to consider:
1. I think disable_parallel_leader_participation related code can be extracted into a separate patch as it is mainly a debug/test aid. You can also fix the problem reported by Mahendra in that context.

Agreed. I'll create a patch for disable_parallel_leader_participation.

2. I think if we can somehow disallow very small indexes to use parallel workers, then it will be better. Can we use min_parallel_index_scan_size to decide whether a particular index can participate in a parallel vacuum?

I think it's a good idea, but I'm concerned that the default value of
min_parallel_index_scan_size, 512kB, is too small for parallel vacuum
purposes. Given that people who want to use parallel vacuum are likely
to have big tables, the only indexes that could be skipped with the
default value would be brin indexes, I think. Also, I guess the reason
the default value is small is that min_parallel_index_scan_size is
compared against the number of blocks to be scanned during an index
scan, not the whole index. In parallel vacuum, on the other hand, we
would compare it against the whole index size in blocks, because index
vacuuming is always a full scan. So I'm also concerned that users will
get confused about what a reasonable setting is.

As another idea, how about using min_parallel_table_scan_size instead?
That is, we cannot do parallel vacuum on a table smaller than that
value. I think this idea was already proposed once in this thread, but
now I think it's also a good idea.
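
For concreteness, a minimal sketch of what such a check could look like
inside compute_parallel_vacuum_workers() (the v35 naming is assumed, and
the counter name below is hypothetical; whether the knob should be
min_parallel_index_scan_size or min_parallel_table_scan_size is exactly
what is being discussed here):

	/*
	 * Sketch only: when counting indexes that can use a parallel
	 * worker, skip indexes smaller than min_parallel_index_scan_size
	 * (in blocks), so tiny indexes (e.g. brin) don't cause extra
	 * workers to be launched.
	 */
	for (i = 0; i < nindexes; i++)
	{
		uint8	vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

		if (RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
			continue;			/* too small to be worth a worker */

		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
			nindexes_parallel_bulkdel++;
		/* likewise for the cleanup counter(s) */
	}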

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#263Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#260)

On Tue, 3 Dec 2019 at 11:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 3, 2019 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

A few other things I would like you to consider:
1. I think disable_parallel_leader_participation related code can be extracted into a separate patch as it is mainly a debug/test aid. You can also fix the problem reported by Mahendra in that context.

2. I think if we can somehow disallow very small indexes to use parallel workers, then it will be better. Can we use min_parallel_index_scan_size to decide whether a particular index can participate in a parallel vacuum?

Forgot one minor point. Please run pgindent on all the patches.

Got it. I will run pgindent before sending patches from next time.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#264Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#262)

On Wed, Dec 4, 2019 at 1:58 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 3 Dec 2019 at 11:55, Amit Kapila <amit.kapila16@gmail.com> wrote:

In your code, I think if two workers enter the compute_parallel_delay
function at the same time, they both add their local balance to
VacuumSharedCostBalance, and both workers sleep because both values
reach the VacuumCostLimit.

True, but isn't it more appropriate because the local cost of any
worker should ideally be added to the shared cost as soon as it is
incurred? I mean to say that we are not adding any cost to the shared
balance without actually incurring it. Then we also consider the
individual worker's local balance and sleep according to that local
balance.

2. I think if we can somehow disallow very small indexes to use parallel workers, then it will be better. Can we use min_parallel_index_scan_size to decide whether a particular index can participate in a parallel vacuum?

I think it's a good idea, but I'm concerned that the default value of
min_parallel_index_scan_size, 512kB, is too small for parallel vacuum
purposes. Given that people who want to use parallel vacuum are likely
to have big tables, the only indexes that could be skipped with the
default value would be brin indexes, I think.

Yeah, or probably hash indexes in some cases.

Also, I guess the reason the default value is small is that
min_parallel_index_scan_size is compared against the number of blocks
to be scanned during an index scan, not the whole index. In parallel
vacuum, on the other hand, we would compare it against the whole index
size in blocks, because index vacuuming is always a full scan. So I'm
also concerned that users will get confused about what a reasonable
setting is.

This setting is about how much of the index we are going to scan, so I
am not sure it matters whether it is a partial or a full index scan.
Also, in an index scan we will launch multiple workers to scan that
index, whereas here we will consider launching just one worker.

As another idea, how about using min_parallel_table_scan_size instead?

Hmm, yeah, that can be another option, but it might not be a good idea
for partial indexes.

That is, we would not do a parallel vacuum on a table smaller than
that value.

Yeah, that makes sense, but I feel that if we can directly target the
index scan size, that may be a better option. If we can't use
min_parallel_index_scan_size, then we can consider this.
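
For illustration, a minimal sketch of such a per-index check; the
variable names are illustrative, not from the patch, while
min_parallel_index_scan_size and RelationGetNumberOfBlocks() already
exist:

/*
 * Sketch only: count as parallel-vacuum candidates just the indexes
 * whose size exceeds min_parallel_index_scan_size.
 */
int         nindexes_parallel = 0;

for (int i = 0; i < nindexes; i++)
{
    if (RelationGetNumberOfBlocks(Irel[i]) >= min_parallel_index_scan_size)
        nindexes_parallel++;
}

/* Each index gets at most one worker, so this caps the useful degree. */
parallel_workers = Min(nindexes_parallel, max_parallel_maintenance_workers);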

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#265Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#264)

On Wed, Dec 4, 2019 at 9:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 4, 2019 at 1:58 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 3 Dec 2019 at 11:55, Amit Kapila <amit.kapila16@gmail.com> wrote:

In your code, I think if two workers enter the compute_parallel_delay
function at the same time, they add their local balances to
VacuumSharedCostBalance and both workers sleep, because both values
reach the VacuumCostLimit.

True, but isn't it more appropriate, because the local cost of any
worker should ideally be added to the shared cost as soon as it is
incurred? I mean to say that we are not adding any cost to the shared
balance without actually incurring it. Then we also consider the
individual worker's local balance and sleep according to it.

Even I think it is better to add the balance to the shared balance at
the earliest opportunity. Just consider the case where there are 5
workers, all with an I/O balance of 20, and VacuumCostLimit is 50.
Actually, their combined balance is 100 (which is double the
VacuumCostLimit), but if we don't add it immediately then none of the
workers will sleep and it may go on to the next cycle, which is not
very good. OTOH, if we add 20 immediately and then check the shared
balance, all the workers might go to sleep if their local balances
have reached the limit, but they will only sleep in proportion to
their local balance. So IMHO, adding the current balance to the
shared balance early is closer to the model we are trying to
implement, i.e. shared cost accounting.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#266Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#263)

On Wed, Dec 4, 2019 at 2:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 3 Dec 2019 at 11:57, Amit Kapila <amit.kapila16@gmail.com> wrote:

Forgot one minor point. Please run pgindent on all the patches.

Got it. I will run pgindent before sending patches from now on.

Today, I again read the patch and found a few more minor comments:

1.
void
-LaunchParallelWorkers(ParallelContext *pcxt)
+LaunchParallelWorkers(ParallelContext *pcxt, int nworkers)

I think we should add a comment for this API change which should
indicate why we need to pass nworkers as an additional parameter when
the context itself contains information about the number of workers.

2.
+ * At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare the
+ * parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch
+ * parallel worker processes.  Once all indexes are processed the
+ * parallel worker processes exit.  And then the leader process
+ * re-initializes the parallel context so that it can use the same DSM
+ * for multiple passses of index vacuum and for performing index cleanup.

a. /And then the leader/After that, the leader .. This will avoid
using 'and' two times in this sentence.
b. typo, /passses/passes

3.
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and prepared the DSM segment.

How about changing it slightly as /and prepared the DSM segment./ and
the DSM segment is initialized.?

4.
-
/* non-export function prototypes */
static void lazy_scan_heap(Relation onerel, VacuumParams *params,
                           LVRelStats *vacrelstats, Relation *Irel,
                           int nindexes, bool aggressive);

Spurious change, please remove. I think this was done by me in one of
the versions.

5.
+ * function we exit from parallel mode.  Index bulk-deletion results are
+ * stored in the DSM segment and update index statistics as a whole after
+ * exited from parallel mode since all writes are not allowed during
+ * parallel mode.

Can we slightly change the above sentence as "Index bulk-deletion
results are stored in the DSM segment and we update index statistics
as a whole after exited from parallel mode since writes are not
allowed during the parallel mode."?

6.
+ /*
+  * Reset the local value so that we compute cost balance during
+  * parallel index vacuuming.
+  */

This comment is a bit unclear. How about "Reset the local cost values
for leader backend as we have already accumulated the remaining
balance of heap."?

7.
+ /* Do vacuum or cleanup one index */

How about changing it as: Do vacuum or cleanup of the index?

8.
+ * The copying the result normally happens only after the first time of
+ * index vacuuming.

/The copying the ../The copying of the

9.
+ /*
+  * no longer need the locally allocated result and now
+  * stats[idx] points to the DSM segment.
+  */

How about changing it as below:
"Now that the stats[idx] points to the DSM segment, we don't need the
locally allocated results."

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#267Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Dilip Kumar (#265)

On Wed, 4 Dec 2019 at 04:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Even I think it is better to add the balance to the shared balance at
the earliest opportunity. Just consider the case where there are 5
workers, all with an I/O balance of 20, and VacuumCostLimit is 50.
Actually, their combined balance is 100 (which is double the
VacuumCostLimit), but if we don't add it immediately then none of the
workers will sleep and it may go on to the next cycle, which is not
very good. OTOH, if we add 20 immediately and then check the shared
balance, all the workers might go to sleep if their local balances
have reached the limit, but they will only sleep in proportion to
their local balance. So IMHO, adding the current balance to the
shared balance early is closer to the model we are trying to
implement, i.e. shared cost accounting.

I agree with adding the balance as soon as it is incurred. But the
problem I'm concerned about is this: suppose we have 4 workers, the
cost limit is 100, and the shared balance is now 95. Two workers,
whose local balances (VacuumCostBalanceLocal) are 40, consume I/O, add
10 to their local balances, and enter the compute_parallel_delay
function at the same time. One worker adds 10 to the shared balance
(VacuumSharedCostBalance), and the other worker also adds 10 to the
shared balance. The one worker then subtracts its local balance from
the shared balance and sleeps, because the shared cost is now 115 (>
the cost limit) and its local balance is 50 (> 0.5*(100/4)). The other
worker then does the same, for the same reason. On the other hand, if
the two workers do that serially, only one worker sleeps and the other
doesn't, because the total shared cost will be 75 when the later
worker enters the condition. At first glance it looks like a
concurrency problem, but is that the expected behaviour?
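
For concreteness, a minimal sketch of the accounting being discussed.
VacuumCostBalance, VacuumCostLimit, VacuumCostDelay, the pg_atomic_*
operations, and pg_usleep() already exist; VacuumSharedCostBalance and
VacuumCostBalanceLocal follow the names used in this thread, and the
body is illustrative, not the actual patch:

/*
 * Sketch of shared cost accounting for parallel vacuum workers.
 * Assumes VacuumCostBalanceLocal is a per-worker global (as in this
 * thread) and VacuumSharedCostBalance lives in the DSM segment.
 */
static void
compute_parallel_delay(pg_atomic_uint32 *VacuumSharedCostBalance,
                       int nworkers)
{
    uint32      shared_balance;

    /* Publish the locally incurred cost to the shared balance first. */
    shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
                                             VacuumCostBalance);
    VacuumCostBalanceLocal += VacuumCostBalance;
    VacuumCostBalance = 0;

    /*
     * Sleep only when the shared balance has reached the limit and this
     * worker has consumed at least half of its fair share.  Two workers
     * can pass both tests concurrently -- the race described above --
     * and each then subtracts its own local balance and sleeps.
     */
    if (shared_balance >= (uint32) VacuumCostLimit &&
        VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers))
    {
        double      msec;

        msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;

        /* Give back only this worker's contribution, then sleep. */
        pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
                                VacuumCostBalanceLocal);
        VacuumCostBalanceLocal = 0;
        pg_usleep((long) (msec * 1000));
    }
}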

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#268Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#222)

On Thu, Nov 21, 2019 at 12:32 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind patch,
I am a bit doubtful about this kind of arrangement, where the code in
the "if" is always unreachable with the current AMs. I am not sure
what the best way to handle this is; should we just drop
amestimateparallelvacuum altogether? Currently, we are just providing
a size-estimate function without a copy function. Even if in the
future some AM gives an estimate of the size of its stats, we cannot
directly memcpy the stats from local memory to shared memory; we might
then also need a copy function from the AM so that it can flatten the
stats and store them in the proper format.

I agree that it's a crock to add an AM method that is never used for
anything. That's just asking for the design to prove buggy and
inadequate. One way to avoid this would be to require that every AM
that wants to support parallel vacuuming supply this method, and if it
wants to just return sizeof(IndexBulkDeleteResult), then it can. But I
also think someone should modify one of the AMs to use a
differently-sized object, and then see whether they can really make
parallel vacuum work with this patch. If, as you speculated here, it
needs another API, then we should add both of them or neither. A
half-baked solution is worse than nothing at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#269Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#256)

On Mon, Dec 2, 2019 at 2:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

It's just an example, I'm not saying your idea is bad. ISTM the idea
is good on the assumption that all indexes take the same time, or take
a long time, so I'd also like to consider whether this is true in
production and which approach is better if we don't have such an
assumption.

I think his idea is good. You're not wrong when you say that there are
cases where it could work out badly, but I think on the whole it's a
clear improvement. Generally, the indexes should be of relatively
similar size because index size is driven by table size; it's true
that different AMs could result in different-size indexes, but it
seems like a stretch to suppose that the indexes that don't support
parallelism are also going to be the little tiny ones that go fast
anyway, unless we have some evidence that this is really true. I also
wonder whether we really need the option to disable parallel vacuum in
the first place. Maybe I'm not looking in the right place, but I'm not
finding anything in the way of comments or documentation explaining
why some AMs don't support it. It's an important design point, and
should be documented.

I also think PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION seems like a
waste of space. For parallel queries, there is a trade-off between
having the leader do work (which can speed up the query) and having it
remain idle so that it can immediately absorb tuples from workers and
keep them from having their tuple queues fill up (which can speed up
the query). But here, at least as I understand it, there's no such
trade-off. Having the leader fail to participate is just a loser.
Maybe it's useful to test while debugging the patch, but why should
the committed code support it?

To respond to another point from a different part of the email chain,
the reason why LaunchParallelWorkers() does not take an argument for
the number of workers is because I believed that the caller should
always know how many workers they're going to want at the time they
CreateParallelContext(). Increasing it later is not possible, because
the DSM has already been sized based on the count provided. I grant that it
would be possible to allow the number to be reduced later, but why
would you want to do that? Why not get the number right when first
creating the DSM?

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay? As I understand it, any serious vacuuming is
going to be I/O-bound, so do you really need multiple workers at the
same time that you are limiting the I/O rate? Perhaps it's possible if
the I/O limit is so high that a single worker can't hit the limit by
itself, but multiple workers can, but it seems like a bad idea to
spawn more workers and then throttle them rather than just starting
fewer workers. In any case, the algorithm suggested in vacuumlazy.c
around the definition of VacuumSharedCostBalance seems almost the
opposite of what you probably want. The idea there seems to be that
you shouldn't make a worker sleep if it hasn't actually got to do
anything. Apparently the idea is that if you have 3 workers and you
only have enough I/O rate for 1 worker, you want all 3 workers to run
at once, so that the I/O is random, rather than having them run 1 at a
time, so that the I/O is sequential. That seems counterintuitive. It
could be right if the indexes are in different tablespaces, but if
they are in the same tablespace it's probably wrong. I guess it could
still be right if there's just so much I/O that you aren't going to
run out ever, and the more important consideration is that you don't
know which index will take longer to vacuum and so want to start them
all at the same time so that you don't accidentally start the slow one
last, but that sounds like a stretch. I think this whole area needs
more thought. I feel like we're trying to jam a go-slower feature and
a go-faster feature into the same box.

+ * vacuum and for performing index cleanup.  Note that all parallel workers
+ * live during either index vacuuming or index cleanup but the leader process
+ * neither exits from the parallel mode nor destroys the parallel context.
+ * For updating the index statistics, since any updates are not allowed during
+ * parallel mode we update the index statistics after exited from the parallel

The first of these sentences ("Note that all...") is not very clear to
me, and seems like it may amount to a statement that the leader
doesn't try to destroy the parallel context too early, but since I
don't understand it, maybe that's not what it is saying. The second
sentence needs exited -> exiting, and maybe some more work on the
grammar, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#270Dilip Kumar
dilipbalaut@gmail.com
In reply to: Masahiko Sawada (#267)

On Thu, Dec 5, 2019 at 12:21 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I agree with adding the balance as soon as it is incurred. But the
problem I'm concerned about is this: suppose we have 4 workers, the
cost limit is 100, and the shared balance is now 95. Two workers,
whose local balances (VacuumCostBalanceLocal) are 40, consume I/O, add
10 to their local balances, and enter the compute_parallel_delay
function at the same time. One worker adds 10 to the shared balance
(VacuumSharedCostBalance), and the other worker also adds 10 to the
shared balance. The one worker then subtracts its local balance from
the shared balance and sleeps, because the shared cost is now 115 (>
the cost limit) and its local balance is 50 (> 0.5*(100/4)). The other
worker then does the same, for the same reason. On the other hand, if
the two workers do that serially, only one worker sleeps and the other
doesn't, because the total shared cost will be 75 when the later
worker enters the condition. At first glance it looks like a
concurrency problem, but is that the expected behaviour?

If both workers sleep, then the remaining shared balance will be 15
and their local balances will be 0. OTOH, if one worker sleeps, then
the remaining shared balance will be 75, so the second worker has
missed this sleep cycle; but on the next opportunity, when the shared
value again reaches 100 and the second worker performs more I/O, it
will sleep for a longer duration.

Even if we add it to the shared balance later (like you were doing
earlier), we can still reproduce similar behavior. Suppose the shared
balance is 85 and both workers have a local balance of 40 each. Now
each worker has done I/O of 10. If we don't add to the shared balance,
then both workers will see the balance as 85+10=95, so neither of them
will sleep. OTOH, if they do it serially, the first worker will add 10
and make it 95, and then the second worker will locally check 95+10,
which is more than 100, and it will sleep. Right?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#271Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#269)

On Thu, Dec 5, 2019 at 1:41 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Dec 2, 2019 at 2:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

It's just an example, I'm not saying your idea is bad. ISTM the idea
is good on the assumption that all indexes take the same time, or take
a long time, so I'd also like to consider whether this is true in
production and which approach is better if we don't have such an
assumption.

I think his idea is good. You're not wrong when you say that there are
cases where it could work out badly, but I think on the whole it's a
clear improvement. Generally, the indexes should be of relatively
similar size because index size is driven by table size; it's true
that different AMs could result in different-size indexes, but it
seems like a stretch to suppose that the indexes that don't support
parallelism are also going to be the little tiny ones that go fast
anyway, unless we have some evidence that this is really true. I also
wonder whether we really need the option to disable parallel vacuum in
the first place.

I think it could be required for the cases where the AM doesn't have a
way (or it is difficult to come up with a way) to communicate the
stats allocated by the first ambulkdelete call to the subsequent ones
until amvacuumcleanup. Currently, we have such a case for the Gist
index, see email thread [1]. Though we have come up with a way to
avoid that for Gist indexes, I am not sure if we can assume that it is
the case for any possible index AM, especially when there is a
provision that an index AM can have additional stats information. In
the worst case, if we have to modify some existing index AM like we
did for the Gist index, we need such a provision so that it is
possible. In the ideal case, the index AM should provide a way to copy
such stats, but we can't assume that, so we came up with this option.

We have used this for dummy_index_am which also provides a way to test it.

Maybe I'm not looking in the right place, but I'm not
finding anything in the way of comments or documentation explaining
why some AMs don't support it. It's an important design point, and
should be documented.

Agreed. We should do this.

I also think PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION seems like a
waste of space. For parallel queries, there is a trade-off between
having the leader do work (which can speed up the query) and having it
remain idle so that it can immediately absorb tuples from workers and
keep them from having their tuple queues fill up (which can speed up
the query). But here, at least as I understand it, there's no such
trade-off. Having the leader fail to participate is just a loser.
Maybe it's useful to test while debugging the patch,

Yeah, it is primarily a debugging/testing aid patch, and it helped us
in discovering some issues. During development, it is being used for
testing purposes as well. This is the reason the code is under #ifdef.

but why should
the committed code support it?

I am also not sure whether we should commit this part of the code,
which is why I said in one of the above emails to keep it as a
separate patch. We can decide later whether to commit it. Now, the
point in its favor is that we already have a similar define
(DISABLE_LEADER_PARTICIPATION) for parallel create index, so having it
here is not a bad idea. I think it might help us in debugging cases
where we want to force the index to be vacuumed by some worker. We
might want to have something like force_parallel_mode for
testing/debugging purposes, but I am not sure which is better. I think
having something as a debugging aid for such features is good.

To respond to another point from a different part of the email chain,
the reason why LaunchParallelWorkers() does not take an argument for
the number of workers is because I believed that the caller should
always know how many workers they're going to want at the time they
CreateParallelContext(). Increasing it later is not possible, because
the DSM has already sized based on the count provided. I grant that it
would be possible to allow the number to be reduced later, but why
would you want to do that? Why not get the number right when first
creating the DSM?

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both can have different worker
requirements. The second phase (cleanup) would normally need fewer
workers: if the work was done in the first phase, the second wouldn't
need it. But we have exceptions like gin indexes, where we need
workers for the second phase as well because it passes over the index
again even if we cleaned the index in the first phase. Now, consider
the case where we have 3 btree indexes and 2 gin indexes; we would
need 5 workers for the index vacuum phase and 2 workers for the index
cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.
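
To illustrate, a sketch of deriving the per-phase worker counts; the
two predicates are hypothetical stand-ins for however an AM advertises
support, not actual API:

/*
 * Sketch: compute how many workers each phase can usefully employ.
 */
int         nworkers_bulkdel = 0;   /* index vacuum phase */
int         nworkers_cleanup = 0;   /* index cleanup phase */

for (int i = 0; i < nindexes; i++)
{
    if (index_supports_parallel_bulkdel(Irel[i]))
        nworkers_bulkdel++;         /* e.g. 3 btree + 2 gin -> 5 */
    if (index_needs_parallel_cleanup(Irel[i]))
        nworkers_cleanup++;         /* e.g. only the 2 gin indexes -> 2 */
}

/* Size the DSM once, for the larger of the two requirements. */
int         nworkers_at_create = Max(nworkers_bulkdel, nworkers_cleanup);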

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay?

Yeah, we also initially thought that it is not legitimate to use a
parallel vacuum with a cost delay. But to get a wider view, we
started a separate thread [2], and there we reached the conclusion
that we need a solution for throttling [3].

+ * vacuum and for performing index cleanup.  Note that all parallel workers
+ * live during either index vacuuming or index cleanup but the leader process
+ * neither exits from the parallel mode nor destroys the parallel context.
+ * For updating the index statistics, since any updates are not allowed during
+ * parallel mode we update the index statistics after exited from the parallel

The first of these sentences ("Note that all...") is not very clear to
me, and seems like it may amount to a statement that the leader
doesn't try to destroy the parallel context too early, but since I
don't understand it, maybe that's not what it is saying.

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum
or index cleanup but the leader process neither exits from the
parallel mode nor destroys the parallel context until the entire
parallel operation is finished." OR something like "The leader backend
holds the parallel context till the index vacuum and cleanup is
finished. Both index vacuum and cleanup separately perform the work
with parallel workers."

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#272Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#269)

On Thu, Dec 5, 2019 at 1:41 AM Robert Haas <robertmhaas@gmail.com> wrote:

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay? As I understand it, any serious vacuuming is
going to be I/O-bound, so do you really need multiple workers at the
same time that you are limiting the I/O rate? Perhaps it's possible if
the I/O limit is so high that a single worker can't hit the limit by
itself, but multiple workers can, but it seems like a bad idea to
spawn more workers and then throttle them rather than just starting
fewer workers.

I agree that there is no point in first spawning more workers to get
the work done faster and later throttling them. Basically, that would
defeat the whole purpose of running it in parallel. OTOH, we should
also consider the cases where a vacuum may not hit the I/O limit,
right? It may find all the pages in shared buffers and might not need
to dirty a lot of pages. So I think for such cases it is advantageous
to run in parallel. The problem is that there is no way to know in
advance whether the total I/O for the vacuum will hit the I/O limit,
so we cannot decide in advance whether to run it in parallel or not.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#273Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#271)

On Thu, Dec 5, 2019 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it could be required for the cases where the AM doesn't have a
way (or it is difficult to come up with a way) to communicate the
stats allocated by the first ambulkdelete call to the subsequent ones
until amvacuumcleanup. Currently, we have such a case for the Gist
index, see email thread [1].

oops, I had referred to a couple of other discussions in my reply but
forgot to mention the links, doing it now.

[1]: /messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
[2]: /messages/by-id/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
[3]: /messages/by-id/20191106022550.zq7nai2ct2ashegq@alap3.anarazel.de

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#274Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#268)

On Thu, Dec 5, 2019 at 12:54 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 21, 2019 at 12:32 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

In the v33-0001-Add-index-AM-field-and-callback-for-parallel-ind patch,
I am a bit doubtful about this kind of arrangement, where the code in
the "if" is always unreachable with the current AMs. I am not sure
what the best way to handle this is; should we just drop
amestimateparallelvacuum altogether? Currently, we are just providing
a size-estimate function without a copy function. Even if in the
future some AM gives an estimate of the size of its stats, we cannot
directly memcpy the stats from local memory to shared memory; we might
then also need a copy function from the AM so that it can flatten the
stats and store them in the proper format.

I agree that it's a crock to add an AM method that is never used for
anything. That's just asking for the design to prove buggy and
inadequate. One way to avoid this would be to require that every AM
that wants to support parallel vacuuming supply this method, and if it
wants to just return sizeof(IndexBulkDeleteResult), then it can. But I
also think someone should modify one of the AMs to use a
differently-sized object, and then see whether they can really make
parallel vacuum work with this patch. If, as you speculated here, it
needs another API, then we should add both of them or neither. A
half-baked solution is worse than nothing at all.

It is possible that we need another API to make it work, as is
currently the case for the Gist index where we need to somehow
serialize it first (though, as mentioned earlier, we now have a way to
avoid serializing it). However, if it is some simple case where there
are some additional constants apart from IndexBulkDeleteResult, then
we don't need it. I think here we were cautious not to expose more
APIs unless there is a real need, but I guess it is better to
completely avoid such cases and not expose any API unless we have some
examples. In any case, the user will have the facility to disable
parallel vacuum for such cases.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#275Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#271)

On Thu, Dec 5, 2019 at 12:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it could be required for the cases where the AM doesn't have a
way (or it is difficult to come up with a way) to communicate the
stats allocated by the first ambulkdelete call to the subsequent ones
until amvacuumcleanup. Currently, we have such a case for the Gist
index, see email thread [1]. Though we have come up with a way to
avoid that for Gist indexes, I am not sure if we can assume that it is
the case for any possible index AM, especially when there is a
provision that an index AM can have additional stats information. In
the worst case, if we have to modify some existing index AM like we
did for the Gist index, we need such a provision so that it is
possible. In the ideal case, the index AM should provide a way to copy
such stats, but we can't assume that, so we came up with this option.

We have used this for dummy_index_am which also provides a way to test it.

I think it might be a good idea to change what we expect index AMs to
do rather than trying to make anything that they happen to be doing
right now work, no matter how crazy. In particular, suppose we say
that you CAN'T add data on to the end of IndexBulkDeleteResult any
more, and that instead the extra data is passed through a separate
parameter. And then you add an estimate method that gives the size of
the space provided by that parameter (and if the estimate method isn't
defined then the extra parameter is passed as NULL) and document that
the data stored there might get flat-copied. Now, you've taken the
onus off of parallel vacuum to cope with any crazy thing a
hypothetical AM might be doing, and instead you've defined the
behavior of that hypothetical AM as wrong. If somebody really needs
that, it's now their job to modify the index AM machinery further
instead of your job to somehow cope.
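
As a sketch of that shape, where the amprivate parameter and the
estimate callback's exact signature are hypothetical illustrations,
not actual PostgreSQL API:

/* Optional per-AM callback: size of the AM-private vacuum stats. */
typedef Size (*amestimateparallelvacuum_function) (void);

/*
 * The AM-private area would be passed as a separate argument rather
 * than tacked onto the end of IndexBulkDeleteResult.  The caller may
 * flat-copy (memcpy) this area between local and shared memory, so it
 * must not contain pointers.  If the AM defines no estimate callback,
 * amprivate is simply NULL.
 */
IndexBulkDeleteResult *ambulkdelete(IndexVacuumInfo *info,
                                    IndexBulkDeleteResult *stats,
                                    IndexBulkDeleteCallback callback,
                                    void *callback_state,
                                    void *amprivate);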

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both can have different worker
requirements. The second phase (cleanup) would normally need fewer
workers: if the work was done in the first phase, the second wouldn't
need it. But we have exceptions like gin indexes, where we need
workers for the second phase as well because it passes over the index
again even if we cleaned the index in the first phase. Now, consider
the case where we have 3 btree indexes and 2 gin indexes; we would
need 5 workers for the index vacuum phase and 2 workers for the index
cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.
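
Something like the following, as a sketch; the extra parameter and the
lps->pcxt usage are hypothetical, since the existing function takes
only the context:

/*
 * Sketch of the suggested change: allow the caller to relaunch with
 * fewer workers than the context was created with -- never more,
 * since the DSM was sized at CreateParallelContext() time.
 */
extern void ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers);

/* Illustrative usage between the two index phases: */
ReinitializeParallelDSM(lps->pcxt, nworkers_for_cleanup);
LaunchParallelWorkers(lps->pcxt);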

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay?

Yeah, we also initially thought that it is not legitimate to use a
parallel vacuum with a cost delay. But to get a wider view, we
started a separate thread [2], and there we reached the conclusion
that we need a solution for throttling [3].

OK, thanks for the pointer. This doesn't address the other part of my
complaint, though, which is that the whole discussion between you and
Dilip and Sawada-san presumes that the delays ought to be
scattered across the workers roughly in proportion to their share of
the I/O, and it seems NOT AT ALL clear that this is actually a
desirable property. You're all assuming that, but none of you has
justified it, and I think the opposite might be true in some cases.
You're adding extra complexity for something that isn't a clear
improvement.

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum
or index cleanup but the leader process neither exits from the
parallel mode nor destroys the parallel context until the entire
parallel operation is finished." OR something like "The leader backend
holds the parallel context till the index vacuum and cleanup is
finished. Both index vacuum and cleanup separately perform the work
with parallel workers."

How about if you just delete it? You don't need a comment explaining
that this caller of CreateParallelContext() does something which
*every* caller of CreateParallelContext() must do. If you didn't do
that, you'd fail assertions and everything would break, so *of course*
you are doing it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#276Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#272)

[ Please trim excess quoted material from your replies. ]

On Thu, Dec 5, 2019 at 12:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I agree that there is no point in first spawning more workers to get
the work done faster and later throttling them. Basically, that would
defeat the whole purpose of running it in parallel.

Right. I mean if you throttle something that would have otherwise
kept 3 workers running full blast back to the point where it uses the
equivalent of 2.5 workers, that might make sense. It's a little
marginal, maybe, but sure. But once you throttle it back to <= 2
workers, you're just wasting resources.

I think my concern here is ultimately more about usability than
whether or not we allow throttling. I agree that there are some
possible cases where throttling a parallel vacuum is useful, so I
guess we should support it. But I also think there's a real risk of
people not realizing that throttling is happening and then being sad
because they used parallel VACUUM and it was still slow. I think we
should document explicitly that parallel VACUUM is still potentially
throttled and that you should consider setting the cost delay to a
higher value or 0 before using it.

We might even want to add a FAST option (or similar) to VACUUM that
makes it behave as if vacuum_cost_delay = 0, and add something to the
examples section for VACUUM that suggests e.g.

VACUUM (PARALLEL 3, FAST) my_big_table
Vacuum my_big_table with 3 workers and with resource throttling
disabled for maximum performance.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#277Mahendra Singh
mahi6run@gmail.com
In reply to: Robert Haas (#276)

Please find some review comments for the v35 patch set:

1.
+    /* Return immediately when parallelism disabled */
+    if (max_parallel_maintenance_workers == 0)
+        return 0;
+
Here, we should also check max_worker_processes, because if
max_worker_processes is set to 0, then we can't launch any workers, so
we should return from here.

2.
+    /* cap by max_parallel_maintenace_workers */
+    parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
Here also, we should take max_worker_processes into account when
calculating parallel_workers. (By default, max_worker_processes = 8.)
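
That is, something along these lines, as a sketch of the suggested
cap; parallel_workers is the variable from the quoted hunk, and
max_worker_processes is the existing GUC:

/* Sketch of the suggested cap, also honoring max_worker_processes. */
parallel_workers = Min(parallel_workers,
                       Min(max_parallel_maintenance_workers,
                           max_worker_processes));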

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#278Dilip Kumar
dilipbalaut@gmail.com
In reply to: Mahendra Singh (#277)

On Fri, Dec 6, 2019 at 12:55 AM Mahendra Singh <mahi6run@gmail.com> wrote:

Please find some review comments for the v35 patch set:

1.
+    /* Return immediately when parallelism disabled */
+    if (max_parallel_maintenance_workers == 0)
+        return 0;
+
Here, we should also check max_worker_processes, because if
max_worker_processes is set to 0, then we can't launch any workers, so
we should return from here.

2.
+    /* cap by max_parallel_maintenace_workers */
+    parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
Here also, we should take max_worker_processes into account when
calculating parallel_workers. (By default, max_worker_processes = 8.)

IMHO, it's enough to cap with max_parallel_maintenance_workers. I
think it's the user's responsibility to keep
max_parallel_maintenance_workers within the max_worker_processes
limit. And if the user fails to do so, or enough workers are not
available, then LaunchParallelWorkers will take care of it.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#279Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#275)

On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas@gmail.com> wrote:

I think it might be a good idea to change what we expect index AMs to
do rather than trying to make anything that they happen to be doing
right now work, no matter how crazy. In particular, suppose we say
that you CAN'T add data on to the end of IndexBulkDeleteResult any
more, and that instead the extra data is passed through a separate
parameter. And then you add an estimate method that gives the size of
the space provided by that parameter (and if the estimate method isn't
defined then the extra parameter is passed as NULL) and document that
the data stored there might get flat-copied.

I think this is a good idea and serves the purpose we are trying to
achieve currently. However, if there are any index AMs that are using
the current way to pass stats with additional information, they would
need to change even if they don't want to use the parallel vacuum
functionality (say because their indexes are too small, or for
whatever other reason). I think this is a reasonable trade-off and the
changes on their end won't be that big. So, we should do this.

Now, you've taken the
onus off of parallel vacuum to cope with any crazy thing a
hypothetical AM might be doing, and instead you've defined the
behavior of that hypothetical AM as wrong. If somebody really needs
that, it's now their job to modify the index AM machinery further
instead of your job to somehow cope.

makes sense.

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both can have different worker
requirements. The second phase (cleanup) would normally need fewer
workers: if the work was done in the first phase, the second wouldn't
need it. But we have exceptions like gin indexes, where we need
workers for the second phase as well because it passes over the index
again even if we cleaned the index in the first phase. Now, consider
the case where we have 3 btree indexes and 2 gin indexes; we would
need 5 workers for the index vacuum phase and 2 workers for the index
cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay?

Yeah, we also initially thought that it is not legitimate to use a
parallel vacuum with a cost delay. But to get a wider view, we
started a separate thread [2], and there we reached the conclusion
that we need a solution for throttling [3].

OK, thanks for the pointer. This doesn't address the other part of my
complaint, though, which is that the whole discussion between you and
Dilip and Sawada-san presumes that the delays ought to be
scattered across the workers roughly in proportion to their share of
the I/O, and it seems NOT AT ALL clear that this is actually a
desirable property. You're all assuming that, but none of you has
justified it, and I think the opposite might be true in some cases.

IIUC, your complaint is that in some cases, even if the I/O rate is
enough for one worker, we will still launch more workers and throttle
them. The point is that we can't know in advance how much I/O is
required for each index. We can try to do that based on index size,
but I don't think that will be right, because it is possible that for
a bigger index we don't need to dirty the pages and most of the pages
are in shared buffers, etc. The current algorithm won't use more I/O
than required; it will be good for cases where one or some of the
indexes are doing more I/O than the others, and it will also work
equally well when the indexes have a similar amount of work. I think
we could do better if we could predict how much I/O each index
requires before actually scanning the index.

I agree with the other points (add a FAST option for parallel vacuum
and document that parallel vacuum is still potentially throttled ...)
you made in a separate email.

You're adding extra complexity for something that isn't a clear
improvement.

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum
or index cleanup but the leader process neither exits from the
parallel mode nor destroys the parallel context until the entire
parallel operation is finished." OR something like "The leader backend
holds the parallel context till the index vacuum and cleanup is
finished. Both index vacuum and cleanup separately perform the work
with parallel workers."

How about if you just delete it? You don't need a comment explaining
that this caller of CreateParallelContext() does something which
*every* caller of CreateParallelContext() must do. If you didn't do
that, you'd fail assertions and everything would break, so *of course*
you are doing it.

Fair enough, we can just remove this part of the comment.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#280Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#279)

On Fri, 6 Dec 2019 at 10:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas@gmail.com> wrote:

I think it might be a good idea to change what we expect index AMs to
do rather than trying to make anything that they happen to be doing
right now work, no matter how crazy. In particular, suppose we say
that you CAN'T add data on to the end of IndexBulkDeleteResult any
more, and that instead the extra data is passed through a separate
parameter. And then you add an estimate method that gives the size of
the space provided by that parameter (and if the estimate method isn't
defined then the extra parameter is passed as NULL) and document that
the data stored there might get flat-copied.

I think this is a good idea and serves the purpose we are trying to
achieve currently. However, if there are any IndexAM that is using
the current way to pass stats with additional information, they would
need to change even if they don't want to use parallel vacuum
functionality (say because their indexes are too small or whatever
other reasons). I think this is a reasonable trade-off and the
changes on their end won't be that big. So, we should do this.

Now, you've taken the
onus off of parallel vacuum to cope with any crazy thing a
hypothetical AM might be doing, and instead you've defined the
behavior of that hypothetical AM as wrong. If somebody really needs
that, it's now their job to modify the index AM machinery further
instead of your job to somehow cope.

makes sense.

Here, we have a need to reduce the number of workers. Index Vacuum
has two different phases (index vacuum and index cleanup) which uses
the same parallel-context/DSM but both could have different
requirements for workers. The second phase (cleanup) would normally
need fewer workers as if the work is done in the first phase, second
wouldn't need it, but we have exceptions like gin indexes where we
need it for the second phase as well because it takes the pass
over-index again even if we have cleaned the index in the first phase.
Now, consider the case where we have 3 btree indexes and 2 gin
indexes, we would need 5 workers for index vacuum phase and 2 workers
for index cleanup phase. There are other cases too.

We also considered to have a separate DSM for each phase, but that
appeared to have overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modify code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay?

Yeah, we also initially thought that it is not legitimate to use a
parallel vacuum with a cost delay. But to get a wider view, we
started a separate thread [2] and there we reached the conclusion
that we need a solution for throttling [3].

OK, thanks for the pointer. This doesn't address the other part of my
complaint, though, which is that the whole discussion between you and
Dilip and Sawada-san presumes that the delays ought to be scattered
across the workers roughly in proportion to their share of the I/O,
and it seems NOT AT ALL clear that this is actually a desirable
property. You're all assuming that, but none of you has justified it,
and I think the opposite might be true in some cases.

IIUC, your complaint is that in some cases, even if the I/O rate is
enough for one worker, we will still launch more workers and throttle
them. The point is we can't know in advance how much I/O is required
for each index. We can try to estimate that based on index size, but I
don't think that will be right, because it is possible that for a
bigger index we don't need to dirty the pages and most of the pages
are in shared buffers, etc. The current algorithm won't use more I/O
than required, and it will be good for cases where one or some of the
indexes are doing more I/O as compared to others; it will also work
equally well when the indexes have a similar amount of work. I think
we could do better if we could predict how much I/O each index
requires before actually scanning the index.
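
Roughly, the idea is that each worker charges its I/O cost to a shared
balance, so whichever worker does more I/O sleeps more often. A
simplified standalone sketch, with plain C11 atomics standing in for
the patch's pg_atomic counters and a simplified trigger condition:

#include <stdatomic.h>

static _Atomic unsigned int shared_cost_balance;
static _Atomic unsigned int active_nworkers;

/*
 * Returns nonzero when this worker should sleep: the group as a whole
 * has exceeded the cost limit and this worker has consumed at least
 * its proportional share of it.
 */
static int
worker_should_sleep(unsigned int my_cost, unsigned int cost_limit)
{
	unsigned int nworkers = atomic_load(&active_nworkers);
	unsigned int total;

	total = atomic_fetch_add(&shared_cost_balance, my_cost) + my_cost;
	if (nworkers == 0)
		nworkers = 1;
	return total >= cost_limit && my_cost >= cost_limit / nworkers;
}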

I agree with the other points (add a FAST option for parallel vacuum
and document that parallel vacuum is still potentially throttled ...)
you made in a separate email.

You're adding extra complexity for something that isn't a clear
improvement.

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum
or index cleanup but the leader process neither exits from the
parallel mode nor destroys the parallel context until the entire
parallel operation is finished." OR something like "The leader backend
holds the parallel context till the index vacuum and cleanup is
finished. Both index vacuum and cleanup separately perform the work
with parallel workers."

How about if you just delete it? You don't need a comment explaining
that this caller of CreateParallelContext() does something which
*every* caller of CreateParallelContext() must do. If you didn't do
that, you'd fail assertions and everything would break, so *of course*
you are doing it.

Fair enough, we can just remove this part of the comment.

Hi all,
Below is a brief summary of my testing of the v35 patch set.

1.
All the test cases are passing on top of the v35 patch set (make
check-world and all contrib test cases).

2.
With PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION enabled, "make
check-world" is passing.

3.
After the v35 patch, the vacuum.sql regression test takes too much time
due to a large number of inserts, so we can reduce that time by reducing
the number of tuples.
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;

Here, instead of 100000, we can use 1000 to reduce the time of this test
case, because we only want to test the code and functionality.

4.
I tested the functionality of parallel vacuum with different server
configuration settings and the behavior is as expected:
shared_buffers, max_parallel_workers, max_parallel_maintenance_workers,
vacuum_cost_limit, vacuum_cost_delay, maintenance_work_mem,
max_worker_processes.

5.
Index and table stats of parallel vacuum match those of normal vacuum.

postgres=# select * from pg_statio_all_tables where relname = 'test';
 relid | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
-------+------------+---------+----------------+---------------+---------------+--------------+-----------------+----------------+----------------+---------------
 16384 | public     | test    |            399 |          5000 |             3 |            0 |               0 |              0 |              0 |             0
(1 row)

6.
Vacuum progress reporting is as expected.

postgres=# select * from pg_stat_progress_vacuum;
  pid  | datid | datname  | relid |        phase        | heap_blks_total | heap_blks_scanned | heap_blks_vacuumed | index_vacuum_count | max_dead_tuples | num_dead_tuples
-------+-------+----------+-------+---------------------+-----------------+-------------------+--------------------+--------------------+-----------------+-----------------
 44161 | 13577 | postgres | 16384 | cleaning up indexes |           41650 |             41650 |              41650 |                  1 |        11184810 |         1000000
(1 row)

7.
If any worker (or the leader) hits an error, all the workers exit
immediately and the operation is aborted.

8.
I tested parallel vacuum for all index types and with varying index
sizes; all are working and I didn't see any unexpected behavior.

9.
While testing, I found that if we delete all the tuples from a table,
the size of btree indexes still does not reduce.

Delete all tuples from the table:
before vacuum, total pages in btree index: 8000
after vacuum (normal/parallel), total pages in btree index: 8000
The size of the table itself does reduce after deleting all the tuples.
Can we add a check in vacuum to truncate all the pages of btree indexes
if there are no tuples in the table?

Please let me know if you have any inputs for more testing.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#281Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#279)

Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas@gmail.com> wrote:

I think it might be a good idea to change what we expect index AMs to
do rather than trying to make anything that they happen to be doing
right now work, no matter how crazy. In particular, suppose we say
that you CAN'T add data on to the end of IndexBulkDeleteResult any
more, and that instead the extra data is passed through a separate
parameter. And then you add an estimate method that gives the size of
the space provided by that parameter (and if the estimate method isn't
defined then the extra parameter is passed as NULL) and document that
the data stored there might get flat-copied.

I think this is a good idea and serves the purpose we are trying to
achieve currently. However, if there are any index AMs that use
the current way to pass stats with additional information, they would
need to change even if they don't want to use the parallel vacuum
functionality (say because their indexes are too small or for whatever
other reason). I think this is a reasonable trade-off, and the
changes on their end won't be that big. So, we should do this.

Now, you've taken the
onus off of parallel vacuum to cope with any crazy thing a
hypothetical AM might be doing, and instead you've defined the
behavior of that hypothetical AM as wrong. If somebody really needs
that, it's now their job to modify the index AM machinery further
instead of your job to somehow cope.

makes sense.

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both could have different
requirements for workers. The second phase (cleanup) would normally
need fewer workers: if the work is done in the first phase, the second
wouldn't need it. But we have exceptions like gin indexes, where we
need workers for the second phase as well, because gin makes a pass
over the index again even if we have cleaned the index in the first
phase. Now, consider the case where we have 3 btree indexes and 2 gin
indexes: we would need 5 workers for the index vacuum phase and 2
workers for the index cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

I think the number of workers could be increased in the cleanup phase.
For example, if we have 1 brin index and 2 gin indexes, then in the
bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#282Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#281)

On Fri, Dec 13, 2019 at 10:03 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit.kapila16@gmail.com> wrote:

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both could have different
requirements for workers. The second phase (cleanup) would normally
need fewer workers: if the work is done in the first phase, the second
wouldn't need it. But we have exceptions like gin indexes, where we
need workers for the second phase as well, because gin makes a pass
over the index again even if we have cleaned the index in the first
phase. Now, consider the case where we have 3 btree indexes and 2 gin
indexes: we would need 5 workers for the index vacuum phase and 2
workers for the index cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

I think the number of workers could be increased in the cleanup phase.
For example, if we have 1 brin index and 2 gin indexes, then in the
bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#283Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#282)

On Fri, 13 Dec 2019 at 14:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 13, 2019 at 10:03 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit.kapila16@gmail.com> wrote:

Here, we have a need to reduce the number of workers. Index vacuum
has two different phases (index vacuum and index cleanup) which use
the same parallel context/DSM, but both could have different
requirements for workers. The second phase (cleanup) would normally
need fewer workers: if the work is done in the first phase, the second
wouldn't need it. But we have exceptions like gin indexes, where we
need workers for the second phase as well, because gin makes a pass
over the index again even if we have cleaned the index in the first
phase. Now, consider the case where we have 3 btree indexes and 2 gin
indexes: we would need 5 workers for the index vacuum phase and 2
workers for the index cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

I think the number of workers could be increased in the cleanup phase.
For example, if we have 1 brin index and 2 gin indexes, then in the
bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional
argument would shrink the DSM, but I understand that it doesn't
actually shrink the DSM but just sets a variable for the number of
workers to launch, is that right? And we also would need to call
ReinitializeParallelDSM() at the beginning of index vacuum or index
cleanup, since at the end of index vacuum we don't know whether we
will next do index vacuum or index cleanup.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#284Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#283)

On Fri, Dec 13, 2019 at 11:08 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 14:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

I think the number of workers could be increased in the cleanup phase.
For example, if we have 1 brin index and 2 gin indexes, then in the
bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional
argument would shrink the DSM, but I understand that it doesn't
actually shrink the DSM but just sets a variable for the number of
workers to launch, is that right?

Yeah, probably, we need to change the nworkers stored in the context,
and it should be less than the value already stored there.

And we also would need to call
ReinitializeParallelDSM() at the beginning of index vacuum or index
cleanup, since at the end of index vacuum we don't know whether we
will next do index vacuum or index cleanup.

Right.
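
In other words, something along these lines; a sketch with simplified
stand-in fields, not the actual parallel.c change:

/* Simplified stand-in for the real ParallelContext. */
typedef struct ParallelContext
{
	int			nworkers;			/* as given to CreateParallelContext() */
	int			nworkers_to_launch; /* how many LaunchParallelWorkers() uses */
} ParallelContext;

static void
reinitialize_parallel_workers(ParallelContext *pcxt, int nworkers_to_launch)
{
	/* Never exceed the worker count the DSM was sized for. */
	if (nworkers_to_launch > pcxt->nworkers)
		nworkers_to_launch = pcxt->nworkers;
	pcxt->nworkers_to_launch = nworkers_to_launch;
}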

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#285Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#284)
3 attachment(s)

On Fri, 13 Dec 2019 at 15:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 13, 2019 at 11:08 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 14:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

How about adding an additional argument to ReinitializeParallelDSM()
that allows the number of workers to be reduced? That seems like it
would be less confusing than what you have now, and would involve
modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

I think the number of workers could be increased in the cleanup phase.
For example, if we have 1 brin index and 2 gin indexes, then in the
bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional
argument would shrink the DSM, but I understand that it doesn't
actually shrink the DSM but just sets a variable for the number of
workers to launch, is that right?

Yeah, probably, we need to change the nworkers stored in the context,
and it should be less than the value already stored there.

And we also would need to call
ReinitializeParallelDSM() at the beginning of index vacuum or index
cleanup, since at the end of index vacuum we don't know whether we
will next do index vacuum or index cleanup.

Right.

I've attached the latest version patch set. These patches require the
gist vacuum patch[1]. The patches incorporate the review comments. In
the current version, only indexes that support parallel vacuum and
whose size is larger than min_parallel_index_scan_size can participate
in parallel vacuum. It's still unclear to me whether using
min_parallel_index_scan_size is the best approach, but I agreed to set
a lower bound on relation size. I separated the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION from the main patch and
I'm working on that patch.

Please review it.

[1]: /messages/by-id/CAA4eK1J1RxmXFAHC34S4_BznT76cfbrvqORSk23iBgRAOj1azw@mail.gmail.com

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v36-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch
From 6ad5bc29bd87b14e7aeec828f3731a12b126fb96 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v36 1/3] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     | 19 ++++++++++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/amapi.h                    |  4 ++
 src/include/commands/vacuum.h                 | 37 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 89 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..1a8b3474db 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..f9211a5ec9 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -731,6 +735,21 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory needed to
+   store statistics returned by the access method.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than size of <structname>IndexBulkDeleteResult</structname> to
+   store statistics.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..00ee84a896 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..5685e8caf6 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..7df990cc63 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..53db3ab24b 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index c67235ab80..a2904a9c82 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -123,6 +123,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..5d814eeba9 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..3fb5a030bd 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b05aedc670 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,43 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel vacuum.
+ * If an index AM doesn't have a way to communicate the index statistics allocated
+ * by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
+ * the index AM cannot participate in parallel vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAm's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase of the index, irrespective of whether the
+ * index was already scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..246d68ffc8 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0

v36-0003-Add-FAST-option-to-vacuum-command.patch
From 6d3d2907236d3ffb5954123cad238d77dcd925b5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 15:18:22 +0900
Subject: [PATCH v36 3/3] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 42 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6f90a4dc40..0fe1a3b4fa 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable cost-based vacuum delay during parallel
+	 * index vacuum. This can be true when the user specified the FAST vacuum
+	 * option.
+	 */
+	bool		disable_delay;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -349,7 +356,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -752,7 +759,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -1985,16 +1992,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt, nworkers);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->disable_delay)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		LaunchParallelWorkers(lps->pcxt);
 
@@ -2007,7 +2017,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->disable_delay)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3051,7 +3061,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3071,14 +3081,14 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
 	 * Compute the number of parallel vacuum workers to launch
 	 */
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested);
+													   params->nworkers);
 
 	/* Can't perform vacuum in parallel */
 	if (parallel_workers <= 0)
@@ -3352,7 +3362,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->disable_delay));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 14a9b2432e..f93c93f9e0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 6e9d918cfe..f772219549 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -182,7 +182,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 5b42371d95..f52d640589 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -117,6 +117,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 0cdda11b25..2e42239da3 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -98,6 +98,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 
 -- INDEX_CLEANUP option
-- 
2.23.0

v36-0002-Add-parallel-option-to-VACUUM-command.patch
From 715bb7dec39f74c87df5b06a0e3b2138ca110fff Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 14:24:26 +0900
Subject: [PATCH v36 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum
process. Therefore parallel vacuum can be used when the table has at
least two indexes, and the parallel degree cannot be larger than
the number of indexes that the table has.

The parallel degree is either specified by the user or determined based
on the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1227 ++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   16 +-
 src/backend/commands/vacuum.c         |  126 ++-
 src/backend/executor/execParallel.c   |    2 +-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    5 +-
 src/include/commands/vacuum.h         |   10 +
 src/test/regress/expected/vacuum.out  |   26 +
 src/test/regress/sql/vacuum.sql       |   25 +
 13 files changed, 1380 insertions(+), 123 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on number of indexes that support parallel vacuum operation on the
+      relation which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum launches before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuuming.
+     Even if this option is specified together with the
+     <option>ANALYZE</option> option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ab09d8408c..6f90a4dc40 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  And then the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  To update the index statistics,
+ * we need to update the system table, and since updates are not
+ * allowed during parallel mode, we update the index statistics after exiting
+ * from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and prepared the DSM segment.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maitenance_work_mem, the new maitenance_work_mem for
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -148,19 +299,17 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +318,42 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, int nindexes, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +667,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup, and they exit once done with all indexes.  At the end
+ *		of this function we exit from parallel mode.  Index bulk-deletion
+ *		results are stored in the DSM segment, and we update index statistics
+ *		as a whole after exiting from parallel mode, since writes are not
+ *		allowed while in parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +687,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -553,13 +746,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +945,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +974,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +994,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1190,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1229,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1375,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1445,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1474,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1589,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1623,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1639,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1461,12 +1663,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1532,7 +1741,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1541,7 +1750,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1589,6 +1798,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1599,16 +1809,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1729,19 +1939,380 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
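+	/* (The leader participates as a worker itself, hence the "- 1" below) */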
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/* Cap by the number of workers computed at the start of parallel lazy vacuum */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt, nworkers);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local value so that we compute cost balance during
+			 * parallel index vacuuming.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, nindexes, stats, vacrelstats, lps);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Skip processing indexes that don't support parallel operation */
+		if (get_indstats(lvshared, idx) == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * This must exist in DSM as we reach here only for indexes that
+		 * support the parallel operation.
+		 */
+		Assert(shared_indstats);
+
+		/* Do vacuum or cleanup one index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in this phase.
+ * Therefore this function must be called by the leader process.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, int nindexes, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		leader_only = (get_indstats(lps->lvshared, i) == NULL ||
+								   skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!leader_only)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup one index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from
+	 * them, because they allocate it locally and it's possible that a
+	 * different vacuum process will vacuum this index next time.  Copying
+	 * the result normally happens only after the first index vacuuming.
+	 * From the second time onwards, we pass the result on the DSM segment
+	 * so that it is updated there directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * We no longer need the locally allocated result; *stats now points
+		 * into the DSM segment.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,30 +2322,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1782,49 +2361,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2132,19 +2695,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
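+ * For example, with maintenance_work_mem = 64MB and 6-byte ItemPointerData
+ * entries, that is roughly 11 million TIDs for a table with indexes.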
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2158,34 +2719,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2199,12 +2775,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2352,3 +2928,448 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The relation sizes of the table and indexes don't affect the parallel
+ * degree for now.  nrequested is the number of parallel workers that the
+ * user requested.  If nrequested is 0 we compute the parallel degree based
+ * on the number of indexes that support parallel index vacuuming.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * so that only indexes supporting parallel vacuum get a stats slot.  Since
+ * we currently don't support parallel vacuum for autovacuum, we don't need
+ * to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns parallel vacuum state if we can launch
+ * even one worker.  This function is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+		return lps;
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * Cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		/*
+		 * Remember the indexes that can participate in parallel index vacuum
+		 * and use this information for index statistics initialization on
+		 * DSM, because the index size can get bigger during vacuum.
+		 */
+		can_parallel_vacuum[i] = true;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
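+
+	/*
+	 * For example, with maintenance_work_mem = 256MB, two launched workers,
+	 * and three indexes that use maintenance_work_mem, each worker gets
+	 * 256MB / Min(2, 3) = 128MB.
+	 */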
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to MAXALIGN the offset here because that is how the shared
+	 * memory size was estimated above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from DSM into local memory and then later use them to
+ * update the index statistics.  One might think that we could exit from
+ * parallel mode, update the index statistics and then destroy the parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(indstats->stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
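+
+	/*
+	 * Skip over the statistics of the preceding indexes.  NULL entries
+	 * occupy no space in the stats area, so only non-NULL slots advance
+	 * the pointer.
+	 */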
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check if the given index participates in parallel index vacuum or parallel
+ * index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in order by OID, which should
+	 * match the leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..b90305cde6 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -440,13 +445,16 @@ InitializeParallelDSM(ParallelContext *pcxt)
 
 /*
  * Reinitialize the dynamic shared memory segment for a parallel context such
- * that we could launch workers for it again.
+ * that we could launch workers for it again.  nworkers is the number of
+ * workers to launch in the next execution.
  */
 void
-ReinitializeParallelDSM(ParallelContext *pcxt)
+ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers)
 {
 	FixedParallelState *fps;
 
+	Assert(nworkers >= 0 && pcxt->nworkers >= nworkers);
+
 	/* Wait for any old workers to exit. */
 	if (pcxt->nworkers_launched > 0)
 	{
@@ -498,7 +506,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +541,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..14a9b2432e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is
+	 * the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,59 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers, including the leader process, to
+ * have a shared view of cost-related parameters (mainly VacuumCostBalance),
+ * and to let each worker update it and then, based on that, decide whether
+ * it needs to sleep.  Additionally, we allow a worker to sleep only if it
+ * has performed I/O above a certain threshold, which is calculated based on
+ * the number of active workers (VacuumActiveNWorkers), and only if the
+ * overall cost balance is more than the VacuumCostLimit set by the system.
+ * The worker then sleeps proportionally to the work it has done and reduces
+ * VacuumSharedCostBalance by the amount it has consumed
+ * (VacuumCostBalanceLocal).  This avoids putting to sleep workers that have
+ * done less or no I/O compared to other workers, and thereby ensures that
+ * workers doing more I/O get throttled more.
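+ *
+ * For example, with VacuumCostLimit = 200 and two active workers, a worker
+ * may sleep only once the shared balance has reached 200 and its own local
+ * balance exceeds 0.5 * (200 / 2) = 50; it then sleeps for
+ * VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit msec and
+ * subtracts its local balance from the shared balance.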
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* also compute the total local balance */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		shared_balance = pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 683b06cdd6..0dd1bf4472 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -873,7 +873,7 @@ ExecParallelReinitialize(PlanState *planstate,
 	 */
 	ExecSetParamPlanMulti(sendParams, GetPerTupleExprContext(estate));
 
-	ReinitializeParallelDSM(pei->pcxt);
+	ReinitializeParallelDSM(pei->pcxt, pei->pcxt->nworkers);
 	pei->tqueue = ExecParallelSetupTupleQueues(pei->pcxt, true);
 	pei->reader = NULL;
 	pei->finished = false;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..86d24650ba 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;	/* Maximum number of workers to launch */
+	int			nworkers_to_launch;	/* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -62,7 +63,7 @@ extern PGDLLIMPORT bool InitializingParallelWorker;
 extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
-extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b05aedc670..6e9d918cfe 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -221,6 +221,11 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * parallel workers, and 0 means the degree is chosen based on the
+	 * number of indexes.
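+	 *
+	 * For example, VACUUM (PARALLEL 3) sets this to 3, VACUUM (PARALLEL)
+	 * sets it to 0, and plain VACUUM leaves it at -1.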
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -230,6 +235,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32	*VacuumSharedCostBalance;
+extern pg_atomic_uint32	*VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..5b42371d95 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0cdda11b25 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0
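
A minimal sketch of the range check implied by the "parallel vacuum degree must be between 1 and 1024" error in the tests above, as it might look inside ExecVacuum()'s option handling; this is an illustration following the error text, not necessarily the patch's exact code:

/* Hypothetical fragment: validate the PARALLEL option's degree. */
nworkers = defGetInt32(opt);
if (nworkers < 1 || nworkers > 1024)
	ereport(ERROR,
			(errcode(ERRCODE_SYNTAX_ERROR),
			 errmsg("parallel vacuum degree must be between 1 and 1024"),
			 parser_errposition(pstate, opt->location)));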

#286Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#285)

On Tue, 17 Dec 2019 at 18:07, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 15:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 13, 2019 at 11:08 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 14:19, Amit Kapila <amit.kapila16@gmail.com> wrote:

How about adding an additional argument to ReinitializeParallelDSM() that allows the number of workers to be reduced? That seems like it would be less confusing than what you have now, and would involve modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in LVParallelState which indicates whether we need to reinitialize the DSM before launching workers. Sawada-San, do you see any problem with this idea?

I think the number of workers could be increased in the cleanup phase. For example, if we have 1 brin index and 2 gin indexes then in the bulkdelete phase we need only 1 worker but in cleanup we need 2 workers.
phase we need only 1 worker but in cleanup we need 2 workers.

I think it shouldn't be more than the number with which we have created a parallel context, no? If that is the case, then I think it should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional argument would reduce the DSM, but I understand that it doesn't actually reduce the DSM and instead just sets a variable for the number of workers to launch, is that right?

Yeah, probably, we need to change the nworkers stored in the context, and it should be less than the value already stored there.

And we also would need to call ReinitializeParallelDSM() at the beginning of index vacuum or index cleanup since, at the end of index vacuum, we don't know whether we will next do index vacuum or index cleanup.

Right.

I've attached the latest version patch set. These patches require the gist vacuum patch[1]. The patches incorporate the review comments. In the current version, only indexes that support parallel vacuum and whose size is larger than min_parallel_index_scan_size can participate in parallel vacuum. It's still unclear to me whether using min_parallel_index_scan_size is the best approach, but I agreed to set a lower bound on relation size. I separated the patch for PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION from the main patch and I'm working on that patch.

Please review it.

[1]

/messages/by-id/CAA4eK1J1RxmXFAHC34S4_BznT76cfbrvqORSk23iBgRAOj1azw@mail.gmail.com
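
To illustrate the gating Sawada-san describes, here is a minimal sketch of the per-index participation check, based on the v36 fragment quoted later in this thread (the amparallelvacuumoptions field name is taken from the patch series; treat this as an illustration rather than the exact code):

/* Decide whether index i can take part in parallel vacuum. */
uint8	vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

/* Skip indexes that don't support parallel vacuum or are too small. */
if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
	RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
	continue;			/* such an index is processed by the leader only */

can_parallel_vacuum[i] = true;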

Thanks for the updated patches. I verified all my reported issues and all are fixed in the v36 patch set.

Below are some review comments:
1.
+ /* cap by max_parallel_maintenace_workers */
+ parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);

Here, the spelling of max_parallel_maintenace_workers is wrong (correct: max_parallel_maintenance_workers).

2.
+ * size of stats for each index. Also, this function Since currently we don't support parallel vacuum
+ * for autovacuum we don't need to care about autovacuum_work_mem

Here, I think the first line should be changed because it is not grammatically correct.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#287Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#285)

On Tue, Dec 17, 2019 at 6:07 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 15:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional argument would reduce the DSM, but I understand that it doesn't actually reduce the DSM and instead just sets a variable for the number of workers to launch, is that right?

Yeah, probably, we need to change the nworkers stored in the context, and it should be less than the value already stored there.

And we also would need to call ReinitializeParallelDSM() at the beginning of index vacuum or index cleanup since, at the end of index vacuum, we don't know whether we will next do index vacuum or index cleanup.

Right.

I've attached the latest version patch set. These patches require the gist vacuum patch[1]. The patches incorporate the review comments.

I was analyzing your changes related to ReinitializeParallelDSM() and it seems like we might launch more workers than necessary for the bulkdelete phase. While creating a parallel context, we used the maximum of "workers required for the bulkdelete phase" and "workers required for cleanup", but now if the number of workers required in the bulkdelete phase is less than for the cleanup phase (as mentioned by you in one example), then we would launch more workers for the bulkdelete phase.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#288Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#287)

On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 17, 2019 at 6:07 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 13 Dec 2019 at 15:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it shouldn't be more than the number with which we have
created a parallel context, no? If that is the case, then I think it
should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional argument would reduce the DSM, but I understand that it doesn't actually reduce the DSM and instead just sets a variable for the number of workers to launch, is that right?

Yeah, probably, we need to change the nworkers stored in the context, and it should be less than the value already stored there.

And we also would need to call ReinitializeParallelDSM() at the beginning of index vacuum or index cleanup since, at the end of index vacuum, we don't know whether we will next do index vacuum or index cleanup.

Right.

I've attached the latest version patch set. These patches require the gist vacuum patch[1]. The patches incorporate the review comments.

I was analyzing your changes related to ReinitializeParallelDSM() and it seems like we might launch more workers than necessary for the bulkdelete phase. While creating a parallel context, we used the maximum of "workers required for the bulkdelete phase" and "workers required for cleanup", but now if the number of workers required in the bulkdelete phase is less than for the cleanup phase (as mentioned by you in one example), then we would launch more workers for the bulkdelete phase.

Good catch. Currently, when creating a parallel context, the number of workers passed to CreateParallelContext() is set not only to pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to specify the number of workers actually to launch after creating the parallel context, or when creating it. Or we could call ReinitializeParallelDSM() even the first time we run index vacuum.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
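
For reference, both counters start out equal when the context is created, which is why a separate way to adjust the launch count per phase is being discussed; a sketch of the relevant assignments in CreateParallelContext():

/* In CreateParallelContext(): capacity and launch count start equal. */
pcxt->nworkers = nworkers;
pcxt->nworkers_to_launch = nworkers;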

#289Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh (#286)

On Wed, 18 Dec 2019 at 03:39, Mahendra Singh <mahi6run@gmail.com> wrote:

Thanks for the updated patches. I verified all my reported issues and all are fixed in the v36 patch set.

Below are some review comments:
1.
+   /* cap by max_parallel_maintenace_workers */
+   parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);

Here, the spelling of max_parallel_maintenace_workers is wrong (correct: max_parallel_maintenance_workers).

2.
+ * size of stats for each index.  Also, this function   Since currently we don't support parallel vacuum
+ * for autovacuum we don't need to care about autovacuum_work_mem

Here, I think the first line should be changed because it is not grammatically correct.

Thank you for reviewing and testing this patch. I'll incorporate your comments in the next version of the patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#290Mahendra Singh
mahi6run@gmail.com
In reply to: Mahendra Singh (#280)

On Tue, 10 Dec 2019 at 00:30, Mahendra Singh <mahi6run@gmail.com> wrote:

On Fri, 6 Dec 2019 at 10:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas@gmail.com> wrote:

I think it might be a good idea to change what we expect index AMs to do rather than trying to make anything that they happen to be doing right now work, no matter how crazy. In particular, suppose we say that you CAN'T add data onto the end of IndexBulkDeleteResult any more, and that instead the extra data is passed through a separate parameter. And then you add an estimate method that gives the size of the space provided by that parameter (and if the estimate method isn't defined then the extra parameter is passed as NULL) and document that the data stored there might get flat-copied.

I think this is a good idea and serves the purpose we are trying to achieve currently. However, if there are any index AMs that use the current way to pass stats with additional information, they would need to change even if they don't want to use the parallel vacuum functionality (say because their indexes are too small, or for whatever other reason). I think this is a reasonable trade-off and the changes on their end won't be that big. So, we should do this.

Now, you've taken the
onus off of parallel vacuum to cope with any crazy thing a
hypothetical AM might be doing, and instead you've defined the
behavior of that hypothetical AM as wrong. If somebody really needs
that, it's now their job to modify the index AM machinery further
instead of your job to somehow cope.

makes sense.

Here, we have a need to reduce the number of workers. Index vacuum has two different phases (index vacuum and index cleanup) which use the same parallel context/DSM, but both could have different requirements for workers. The second phase (cleanup) would normally need fewer workers, because if the work is done in the first phase the second won't need to redo it; but we have exceptions like gin indexes, where we need workers for the second phase as well because it takes a pass over the index again even if we have cleaned the index in the first phase. Now, consider the case where we have 3 btree indexes and 2 gin indexes: we would need 5 workers for the index vacuum phase and 2 workers for the index cleanup phase. There are other cases too.

We also considered having a separate DSM for each phase, but that appeared to add overhead without much benefit.
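
To make the arithmetic above concrete, here is a sketch of the per-phase counting along the lines of compute_parallel_vacuum_workers() in the patches (a fragment of which is quoted later in this thread); an illustration rather than the exact code:

/* Count how many indexes want a worker in each phase. */
for (i = 0; i < nindexes; i++)
{
	uint8	vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

	if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
		nindexes_parallel_bulkdel++;
	if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
		((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
		nindexes_parallel_cleanup++;
}

/* Size the parallel context once, for the busier of the two phases. */
parallel_workers = Max(nindexes_parallel_bulkdel, nindexes_parallel_cleanup);
parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);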

How about adding an additional argument to ReinitializeParallelDSM() that allows the number of workers to be reduced? That seems like it would be less confusing than what you have now, and would involve modifying code in a lot fewer places.

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

Is there any legitimate use case for parallel vacuum in combination
with vacuum cost delay?

Yeah, we also initially thought that it is not legitimate to use a parallel vacuum with a cost delay. But to get a wider view, we started a separate thread [2] and there we reached the conclusion that we need a solution for throttling [3].

OK, thanks for the pointer. This doesn't address the other part of my complaint, though, which is that the whole discussion between you and Dilip and Sawada-san presumes that the delays ought to be scattered across the workers roughly in proportion to their share of the I/O, and it seems NOT AT ALL clear that this is actually a desirable property. You're all assuming that, but none of you has justified it, and I think the opposite might be true in some cases.

IIUC, your complaint is that in some cases, even if the I/O rate is enough for one worker, we will still launch more workers and throttle them. The point is we can't know in advance how much I/O is required for each index. We can try to estimate that based on index size, but I don't think that will be right, because it is possible that for a bigger index we don't need to dirty the pages and most of the pages are in shared buffers, etc. The current algorithm won't use more I/O than required, and it will be good for cases where one or some of the indexes are doing more I/O as compared to others; it will also work equally well when the indexes have a similar amount of work. I think we could do better if we could predict how much I/O each index requires before actually scanning the index.
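
A minimal sketch of the shared cost-balance scheme being defended here, using the variables the patch declares (VacuumSharedCostBalance, VacuumCostBalanceLocal); the real logic lives in vacuum_delay_point() and is more involved:

if (VacuumSharedCostBalance != NULL)
{
	/* Publish this process's accumulated cost to the shared balance. */
	pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
	VacuumCostBalanceLocal += VacuumCostBalance;
	VacuumCostBalance = 0;

	/*
	 * Only processes that actually perform I/O add to the shared balance,
	 * so sleeping is distributed roughly in proportion to each worker's
	 * share of the I/O.
	 */
	if ((int) pg_atomic_read_u32(VacuumSharedCostBalance) >= VacuumCostLimit)
	{
		/* ... subtract the limit from the shared balance and sleep ... */
	}
}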

I agree with the other points (add a FAST option for parallel vacuum
and document that parallel vacuum is still potentially throttled ...)
you made in a separate email.

You're adding extra complexity for something that isn't a clear
improvement.

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum
or index cleanup but the leader process neither exits from the
parallel mode nor destroys the parallel context until the entire
parallel operation is finished." OR something like "The leader backend
holds the parallel context till the index vacuum and cleanup is
finished. Both index vacuum and cleanup separately perform the work
with parallel workers."

How about if you just delete it? You don't need a comment explaining
that this caller of CreateParallelContext() does something which
*every* caller of CreateParallelContext() must do. If you didn't do
that, you'd fail assertions and everything would break, so *of course*
you are doing it.

Fair enough, we can just remove this part of the comment.

Hi All,
Below is a brief summary of the testing of the v35 patch set.

1.
All the test cases are passing on top of the v35 patch set (make check-world and all contrib test cases).

2.
By enabling PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION, "make check-world" is passing.

3.
After the v35 patch, the vacuum.sql regression test takes too much time due to a large number of inserts, so by reducing the number of tuples we can reduce that time.
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;

Here, instead of 100000, we can use 1000 to reduce the time of this test case, because we only want to test the code and functionality.

As we added a check of min_parallel_index_scan_size in the v36 patch set to decide on parallel vacuum, 1000 tuples are not enough to do a parallel vacuum. I can see that we are not launching any workers in the vacuum.sql test case and hence code coverage has also decreased. I am not sure how to fix this.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com


4.
I tested the functionality of parallel vacuum with different server configuration settings and the behavior is as expected:
shared_buffers, max_parallel_workers, max_parallel_maintenance_workers, vacuum_cost_limit, vacuum_cost_delay, maintenance_work_mem, max_worker_processes

5.
Index and table stats of parallel vacuum match those of normal vacuum.

postgres=# select * from pg_statio_all_tables where relname = 'test';
relid | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
-------+------------+---------+----------------+---------------+---------------+--------------+-----------------+----------------+----------------+---------------
16384 | public | test | 399 | 5000 | 3 | 0 | 0 | 0 | 0 | 0
(1 row)

6.
Vacuum progress reporting is as expected.
postgres=# select * from pg_stat_progress_vacuum;
pid | datid | datname | relid | phase | heap_blks_total | heap_blks_scanned | heap_blks_vacuumed | index_vacuum_count | max_dead_tuples | num_dead_tuples
-------+-------+----------+-------+---------------------+-----------------+-------------------+--------------------+--------------------+-----------------+-----------------
44161 | 13577 | postgres | 16384 | cleaning up indexes | 41650 | 41650 | 41650 | 1 | 11184810 | 1000000
(1 row)

7.
If any worker (or the leader) hits an error, then all the workers exit immediately and the action is marked as aborted.

8.
I tested parallel vacuum for all the types of indexes and with varying index sizes; all are working and I didn't see any unexpected behavior.

9.
While testing, I found that if we delete all the tuples from a table, the size of btree indexes still does not reduce.

Delete all tuples from the table:
before vacuum, total pages in btree index: 8000
after vacuum (normal/parallel), total pages in btree index: 8000

The size of the table, however, does reduce after deleting all the tuples. Can we add a check in vacuum to truncate all the pages of btree indexes if there are no tuples in the table?

Please let me know if you have any inputs for more testing.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#291Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#288)

On Wed, Dec 18, 2019 at 11:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit.kapila16@gmail.com> wrote:

I was analyzing your changes related to ReinitializeParallelDSM() and it seems like we might launch more workers than necessary for the bulkdelete phase. While creating a parallel context, we used the maximum of "workers required for the bulkdelete phase" and "workers required for cleanup", but now if the number of workers required in the bulkdelete phase is less than for the cleanup phase (as mentioned by you in one example), then we would launch more workers for the bulkdelete phase.

Good catch. Currently, when creating a parallel context, the number of workers passed to CreateParallelContext() is set not only to pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to specify the number of workers actually to launch after creating the parallel context, or when creating it. Or we could call ReinitializeParallelDSM() even the first time we run index vacuum.

How about just having ReinitializeParallelWorkers, which (as of now) can be called only via vacuum, even for the first time before the launch of workers?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#292Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#290)

[please trim extra text before responding]

On Wed, Dec 18, 2019 at 12:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Tue, 10 Dec 2019 at 00:30, Mahendra Singh <mahi6run@gmail.com> wrote:

3.
After the v35 patch, the vacuum.sql regression test takes too much time due to a large number of inserts, so by reducing the number of tuples we can reduce that time.
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;

Here, instead of 100000, we can use 1000 to reduce the time of this test case, because we only want to test the code and functionality.

As we added a check of min_parallel_index_scan_size in the v36 patch set to decide on parallel vacuum, 1000 tuples are not enough to do a parallel vacuum. I can see that we are not launching any workers in the vacuum.sql test case and hence code coverage has also decreased. I am not sure how to fix this.

Try setting min_parallel_index_scan_size to 0 in the test case (e.g., SET min_parallel_index_scan_size = 0; before the parallel VACUUM statements).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#293Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#291)
1 attachment(s)

On Wed, Dec 18, 2019 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 11:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit.kapila16@gmail.com> wrote:

I was analyzing your changes related to ReinitializeParallelDSM() and it seems like we might launch more workers than necessary for the bulkdelete phase. While creating a parallel context, we used the maximum of "workers required for the bulkdelete phase" and "workers required for cleanup", but now if the number of workers required in the bulkdelete phase is less than for the cleanup phase (as mentioned by you in one example), then we would launch more workers for the bulkdelete phase.

Good catch. Currently, when creating a parallel context, the number of workers passed to CreateParallelContext() is set not only to pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to specify the number of workers actually to launch after creating the parallel context, or when creating it. Or we could call ReinitializeParallelDSM() even the first time we run index vacuum.

How about just having ReinitializeParallelWorkers, which (as of now) can be called only via vacuum, even for the first time before the launch of workers?

See the attached for what I have in mind. A few other comments:

1.
+ shared->disable_delay = (params->options & VACOPT_FAST);

This should be part of the third patch.

2.
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+ LVRelStats *vacrelstats, LVParallelState *lps,
+ int nindexes)
{
..
..
+ /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+ nworkers = Min(nworkers, lps->pcxt->nworkers);
..
}

This should be an Assert. In no case can the computed workers be more than what we have in the context.

3.
+ if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+ ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
+ nindexes_parallel_cleanup++;

I think the second condition should be VACUUM_OPTION_PARALLEL_COND_CLEANUP.

I have fixed the above comments and some given by me earlier [1] in the attached patch. The attached patch is a diff on top of v36-0002-Add-parallel-option-to-VACUUM-command.

Few other comments which I have not fixed:

4.
+ if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+ nindexes_mwm++;
+
+ /* Skip indexes that don't participate parallel index vacuum */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;

Won't we need to worry about the number of indexes that use maintenance_work_mem only for indexes that can participate in a parallel vacuum? If so, the above checks need to be reversed.

5.
/*
+ * Remember indexes that can participate parallel index vacuum and use
+ * it for index statistics initialization on DSM because the index
+ * size can get bigger during vacuum.
+ */
+ can_parallel_vacuum[i] = true;

I am not able to understand the second part of the comment ("because
the index size can get bigger during vacuum."). What is its
relevance?

6.
+/*
+ * Vacuum or cleanup indexes that can be processed by only the leader process
+ * because these indexes don't support parallel operation at that phase.
+ * Therefore this function must be called by the leader process.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, int nindexes,
IndexBulkDeleteResult **stats,
+   LVRelStats *vacrelstats, LVParallelState *lps)
{
..

Why have you changed the order of the nindexes parameter? I think in the previous patch it was the last parameter, and that seems a better place for it. Also, I think after the latest modifications you can remove the second sentence in the above comment ("Therefore this function must be called by the leader process.").

7.
+ for (i = 0; i < nindexes; i++)
+ {
+ bool leader_only = (get_indstats(lps->lvshared, i) == NULL ||
+    skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+ /* Skip the indexes that can be processed by parallel workers */
+ if (!leader_only)
+ continue;

It is better to name this parameter skip_index or something like that.

[1]: /messages/by-id/CAA4eK1+KBAt1JS+asDd7K9C10OtBiyuUC75y8LR6QVnD2wrsMw@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v36-0002-Add-parallel-option-to-VACUUM-command.diff.amit.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6f90a4dc40..0bf62cf2c5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -30,8 +30,8 @@
  * shared information as well as the memory space for storing dead tuples.
  * When starting either index vacuuming or index cleanup, we launch parallel
  * worker processes.  Once all indexes are processed the parallel worker
- * processes exit.  And then the leader process re-initializes the parallel
- * context so that it can use the same DSM for multiple passses of index
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
  * vacuum and for performing index cleanup.  For updating the index statistics,
  * we need to update the system table and since updates are not
  * allowed during parallel mode we update the index statistics after exiting
@@ -140,7 +140,7 @@
 
 /*
  * Macro to check if we are in a parallel lazy vacuum.  If true, we are
- * in the parallel mode and prepared the DSM segment.
+ * in the parallel mode and the DSM segment is initialized.
  */
 #define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
 
@@ -299,6 +299,7 @@ static MultiXactId MultiXactCutoff;
 
 static BufferAccessStrategy vac_strategy;
 
+
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
@@ -675,9 +676,9 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		parallel workers are launched at beginning of index vacuuming and index
  *		cleanup and they exit once done with all indexes.  At the end of this
  *		function we exit from parallel mode.  Index bulk-deletion results are
- *		stored in the DSM segment and update index statistics as a whole after
- *		exited from parallel mode since all writes are not allowed during parallel
- *		mode.
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
  *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
@@ -1969,8 +1970,11 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	else
 		nworkers = lps->nindexes_parallel_bulkdel - 1;
 
-	/* Cap by the worker we computed at the beginning of parallel lazy vacuum */
-	nworkers = Min(nworkers, lps->pcxt->nworkers);
+	/*
+	 * The number of workers required for parallel vacuum phase must be less
+	 * than the number of workers with which parallel context is initialized.
+	 */
+	Assert(lps->pcxt->nworkers >= nworkers);
 
 	/* Setup the shared cost-based vacuum delay and launch workers */
 	if (nworkers > 0)
@@ -1982,7 +1986,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
 
 			/* Reinitialize the parallel context to relaunch parallel workers */
-			ReinitializeParallelDSM(lps->pcxt, nworkers);
+			ReinitializeParallelDSM(lps->pcxt);
 		}
 
 		/* Enable shared cost balance */
@@ -1996,13 +2000,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
 		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
 
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
 		LaunchParallelWorkers(lps->pcxt);
 
 		if (lps->pcxt->nworkers_launched > 0)
 		{
 			/*
-			 * Reset the local value so that we compute cost balance during
-			 * parallel index vacuuming.
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
 			 */
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
@@ -2101,7 +2111,7 @@ parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
 		 */
 		Assert(shared_indstats);
 
-		/* Do vacuum or cleanup one index */
+		/* Do vacuum or cleanup of the index */
 		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
 						 dead_tuples);
 	}
@@ -2181,7 +2191,7 @@ vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
 			*stats = bulkdelete_res;
 	}
 
-	/* Do vacuum or cleanup one index */
+	/* Do vacuum or cleanup of the index */
 	if (lvshared->for_cleanup)
 		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
 						   lvshared->estimated_count);
@@ -2194,7 +2204,7 @@ vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
 	 * amvacuumcleanup to the DSM segment if it's the first time to get it
 	 * from them, because they allocate it locally and it's possible that an
 	 * index will be vacuumed by the different vacuum process at the next
-	 * time.  The copying the result normally happens only after the first
+	 * time.  The copying of the result normally happens only after the first
 	 * time of index vacuuming.  From the second time, we pass the result on
 	 * the DSM segment so that they then update it directly.
 	 *
@@ -2207,8 +2217,8 @@ vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
 		shared_indstats->updated = true;
 
 		/*
-		 * no longer need the locally allocated result and now stats[idx]
-		 * points to the DSM segment.
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
 		 */
 		pfree(*stats);
 		*stats = bulkdelete_res;
@@ -2965,7 +2975,7 @@ compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested)
 		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
 			nindexes_parallel_bulkdel++;
 		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
-			((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
 			nindexes_parallel_cleanup++;
 	}
 
@@ -3158,7 +3168,6 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	MemSet(shared, 0, est_shared);
 	shared->relid = relid;
 	shared->elevel = elevel;
-	shared->disable_delay = (params->options & VACOPT_FAST);
 	shared->maintenance_work_mem_worker =
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index b90305cde6..aae81a571a 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -449,12 +449,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
  * to launch in the next execution.
  */
 void
-ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers)
+ReinitializeParallelDSM(ParallelContext *pcxt)
 {
 	FixedParallelState *fps;
 
-	Assert(nworkers >= 0 && pcxt->nworkers >= nworkers);
-
 	/* Wait for any old workers to exit. */
 	if (pcxt->nworkers_launched > 0)
 	{
@@ -494,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context such that we could
+ * launch the different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run-to-run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers that need to be launched must be less than the
+	 * number of workers with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 0dd1bf4472..683b06cdd6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -873,7 +873,7 @@ ExecParallelReinitialize(PlanState *planstate,
 	 */
 	ExecSetParamPlanMulti(sendParams, GetPerTupleExprContext(estate));
 
-	ReinitializeParallelDSM(pei->pcxt, pei->pcxt->nworkers);
+	ReinitializeParallelDSM(pei->pcxt);
 	pei->tqueue = ExecParallelSetupTupleQueues(pei->pcxt, true);
 	pei->reader = NULL;
 	pei->finished = false;
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 86d24650ba..5add74a6dd 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -63,7 +63,8 @@ extern PGDLLIMPORT bool InitializingParallelWorker;
 extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
-extern void ReinitializeParallelDSM(ParallelContext *pcxt, int nworkers);
+extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
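
Putting the two functions together, the leader-side sequence per pass of index vacuum or cleanup looks roughly like this (condensed from the diff above; the first_time flag is illustrative):

if (!first_time)
{
	pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
	ReinitializeParallelDSM(lps->pcxt);	/* reuse the same DSM segment */
}

/* Adjust the launch count, which can differ between the two phases. */
ReinitializeParallelWorkers(lps->pcxt, nworkers);
LaunchParallelWorkers(lps->pcxt);
/* ... the leader processes its share of indexes, then ... */
WaitForParallelWorkersToFinish(lps->pcxt);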
#294Prabhat Sahu
prabhat.sahu@enterprisedb.com
In reply to: Amit Kapila (#293)

Hi all,

While testing the v36 patch with gist indexes, I came across the segmentation fault below.

-- PG Head+ v36_patch
create table tab1(c1 int, c2 text PRIMARY KEY, c3 bool, c4 timestamp
without time zone, c5 timestamp with time zone, p point);
create index gist_idx1 on tab1 using gist(p);
create index gist_idx2 on tab1 using gist(p);
create index gist_idx3 on tab1 using gist(p);
create index gist_idx4 on tab1 using gist(p);
create index gist_idx5 on tab1 using gist(p);

-- Cancel the insert statement in the middle:
postgres=# insert into tab1 (select x, x||'_c2', 'T', current_date-x/100,
current_date-x/100,point (x,x) from generate_series(1,1000000) x);
^CCancel request sent
ERROR: canceling statement due to user request

-- Segmentation fault during VACUUM(PARALLEL):
postgres=# vacuum(parallel 10) tab1 ;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

-- Below is the stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.14650 postgres
Reading symbols from
/home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 14650]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: centos postgres [local] VACUUM
'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000075e713 in intset_num_entries (intset=0x1f62) at
integerset.c:353
353 return intset->num_entries;
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64
libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64
pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x000000000075e713 in intset_num_entries (intset=0x1f62) at
integerset.c:353
#1 0x00000000004cbe0f in gistvacuum_delete_empty_pages
(info=0x7fff32f8eba0, stats=0x7f2923b3f4d8) at gistvacuum.c:478
#2 0x00000000004cb353 in gistvacuumcleanup (info=0x7fff32f8eba0,
stats=0x7f2923b3f4d8) at gistvacuum.c:124
#3 0x000000000050dcca in index_vacuum_cleanup (info=0x7fff32f8eba0,
stats=0x7f2923b3f4d8) at indexam.c:711
#4 0x00000000005079ba in lazy_cleanup_index (indrel=0x7f292e149560,
stats=0x2db5e40, reltuples=0, estimated_count=false) at vacuumlazy.c:2380
#5 0x00000000005074f0 in vacuum_one_index (indrel=0x7f292e149560,
stats=0x2db5e40, lvshared=0x7f2923b3f460, shared_indstats=0x7f2923b3f4d0,
dead_tuples=0x7f2922fbe2c0) at vacuumlazy.c:2196
#6 0x0000000000507428 in vacuum_indexes_leader (Irel=0x2db5de0,
nindexes=6, stats=0x2db5e38, vacrelstats=0x2db5cb0, lps=0x2db5e90) at
vacuumlazy.c:2155
#7 0x0000000000507126 in lazy_parallel_vacuum_indexes (Irel=0x2db5de0,
stats=0x2db5e38, vacrelstats=0x2db5cb0, lps=0x2db5e90, nindexes=6)
at vacuumlazy.c:2045
#8 0x0000000000507770 in lazy_cleanup_indexes (Irel=0x2db5de0,
stats=0x2db5e38, vacrelstats=0x2db5cb0, lps=0x2db5e90, nindexes=6) at
vacuumlazy.c:2300
#9 0x0000000000506076 in lazy_scan_heap (onerel=0x7f292e1473b8,
params=0x7fff32f8f3e0, vacrelstats=0x2db5cb0, Irel=0x2db5de0, nindexes=6,
aggressive=false)
at vacuumlazy.c:1675
#10 0x0000000000504228 in heap_vacuum_rel (onerel=0x7f292e1473b8,
params=0x7fff32f8f3e0, bstrategy=0x2deb3a0) at vacuumlazy.c:475
#11 0x00000000006ea059 in table_relation_vacuum (rel=0x7f292e1473b8,
params=0x7fff32f8f3e0, bstrategy=0x2deb3a0)
at ../../../src/include/access/tableam.h:1432
#12 0x00000000006ecb74 in vacuum_rel (relid=16384, relation=0x2cf5cf8,
params=0x7fff32f8f3e0) at vacuum.c:1885
#13 0x00000000006eac8d in vacuum (relations=0x2deb548,
params=0x7fff32f8f3e0, bstrategy=0x2deb3a0, isTopLevel=true) at vacuum.c:440
#14 0x00000000006ea776 in ExecVacuum (pstate=0x2deaf90, vacstmt=0x2cf5de0,
isTopLevel=true) at vacuum.c:241
#15 0x000000000091da3d in standard_ProcessUtility (pstmt=0x2cf5ea8,
queryString=0x2cf51a0 "vacuum(parallel 10) tab1 ;",
context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, queryEnv=0x0, dest=0x2cf6188, completionTag=0x7fff32f8f840
"") at utility.c:665
#16 0x000000000091d270 in ProcessUtility (pstmt=0x2cf5ea8,
queryString=0x2cf51a0 "vacuum(parallel 10) tab1 ;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
queryEnv=0x0, dest=0x2cf6188, completionTag=0x7fff32f8f840 "") at
utility.c:359
#17 0x000000000091c187 in PortalRunUtility (portal=0x2d5c530,
pstmt=0x2cf5ea8, isTopLevel=true, setHoldSnapshot=false, dest=0x2cf6188,
completionTag=0x7fff32f8f840 "") at pquery.c:1175
#18 0x000000000091c39e in PortalRunMulti (portal=0x2d5c530,
isTopLevel=true, setHoldSnapshot=false, dest=0x2cf6188, altdest=0x2cf6188,
completionTag=0x7fff32f8f840 "") at pquery.c:1321
#19 0x000000000091b8c8 in PortalRun (portal=0x2d5c530,
count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2cf6188,
altdest=0x2cf6188,
completionTag=0x7fff32f8f840 "") at pquery.c:796
#20 0x00000000009156d4 in exec_simple_query (query_string=0x2cf51a0
"vacuum(parallel 10) tab1 ;") at postgres.c:1227
#21 0x0000000000919a1c in PostgresMain (argc=1, argv=0x2d1f608,
dbname=0x2d1f520 "postgres", username=0x2d1f500 "centos") at postgres.c:4288
#22 0x000000000086de39 in BackendRun (port=0x2d174e0) at postmaster.c:4498
#23 0x000000000086d617 in BackendStartup (port=0x2d174e0) at
postmaster.c:4189
#24 0x0000000000869992 in ServerLoop () at postmaster.c:1727
#25 0x0000000000869248 in PostmasterMain (argc=3, argv=0x2cefd70) at
postmaster.c:1400
#26 0x0000000000778593 in main (argc=3, argv=0x2cefd70) at main.c:210


--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Software India Pvt. Ltd.

The Postgres Database Company

#295Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#293)

On Wed, Dec 18, 2019 at 3:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Few other comments which I have not fixed:

+    /* interface function to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /*
can be NULL */
 } IndexAmRoutine;

One more thing, why have you removed the estimate function from the API patch? It seems to me Robert has given a different suggestion [1] to deal with it. I think he suggests adding a new member like void *private_data to IndexBulkDeleteResult and then providing an estimate function. See his email [1] for a detailed explanation. Did I misunderstand it, or have you handled it differently? Can you please share your thoughts on this?

[1]: /messages/by-id/CA+TgmobjtHdLfQhmzqBNt7VEsz+5w3P0yy0-EsoT05yAJViBSQ@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#296Amit Kapila
amit.kapila16@gmail.com
In reply to: Prabhat Sahu (#294)

On Wed, Dec 18, 2019 at 6:01 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi all,

While testing on v36 patch with gist index, I came across below
segmentation fault.

It seems you forgot to apply the Gist index patch as mentioned by Masahiko-San. You need to first apply the patch at /messages/by-id/CAA4eK1J1RxmXFAHC34S4_BznT76cfbrvqORSk23iBgRAOj1azw@mail.gmail.com and then apply the other v36 patches. If you have already done that, then we need to investigate. Kindly confirm.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#297Prabhat Sahu
prabhat.sahu@enterprisedb.com
In reply to: Amit Kapila (#296)

On Wed, Dec 18, 2019 at 6:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 6:01 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi all,

While testing on v36 patch with gist index, I came across below
segmentation fault.

It seems you forgot to apply the Gist index patch as mentioned by Masahiko-San. You need to first apply the patch at /messages/by-id/CAA4eK1J1RxmXFAHC34S4_BznT76cfbrvqORSk23iBgRAOj1azw@mail.gmail.com and then apply the other v36 patches. If you have already done that, then we need to investigate. Kindly confirm.

Yes, Amit, thanks for the suggestion. I had forgotten to apply the v4 patch.

I have retested the same scenario; now the issue is not reproducible and it is working fine.
--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Software India Pvt. Ltd.

The Postgres Database Company

#298Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#295)

On Wed, Dec 18, 2019 at 6:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 3:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Few other comments which I have not fixed:

+    /* interface function to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /*
can be NULL */
} IndexAmRoutine;

One more thing, why have you removed the estimate function from the API patch?

Thinking about this again, it seems to me what you have done here is probably the right direction, because whatever else we do, we would either have some untested code or need to write/enhance some IndexAM to test this. The point is that we don't have any IndexAM in core (after working around the Gist index) which has this requirement, and we have not even heard from anyone of such a usage, so there is a good chance that whatever we do might not be sufficient for the IndexAMs that have such a usage.

Now, we already provide an option: one can set VACUUM_OPTION_NO_PARALLEL to indicate that the IndexAM can't participate in a parallel vacuum. So, I feel if there is any IndexAM which would like to pass more data along with IndexBulkDeleteResult, it can use that option. It won't be very difficult to enhance or provide new APIs to support a parallel vacuum if we come across such a usage. I think we should just modify the comments atop VACUUM_OPTION_NO_PARALLEL to mention this. That should be good enough for the first version of parallel vacuum, considering we are able to support a parallel vacuum for all in-core indexes.

Thoughts?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
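
As a concrete picture of the opt-out described above, a hypothetical index AM handler (illustrative only; the amparallelvacuumoptions field follows the IndexAmRoutine additions in this patch series) would simply declare itself non-parallel:

Datum
myam_handler(PG_FUNCTION_ARGS)
{
	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

	/* ... fill in the usual callbacks ... */

	/*
	 * This AM piggybacks extra data on IndexBulkDeleteResult, so it opts
	 * out of parallel vacuum entirely.
	 */
	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;

	PG_RETURN_POINTER(amroutine);
}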

#299Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#298)

On Thu, Dec 19, 2019 at 8:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 6:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 3:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Few other comments which I have not fixed:

+    /* interface function to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /*
can be NULL */
} IndexAmRoutine;

One more thing, why have you removed the estimate function from the API patch?

Thinking about this again, it seems to me what you have done here is probably the right direction, because whatever else we do, we would either have some untested code or need to write/enhance some IndexAM to test this. The point is that we don't have any IndexAM in core (after working around the Gist index) which has this requirement, and we have not even heard from anyone of such a usage, so there is a good chance that whatever we do might not be sufficient for the IndexAMs that have such a usage.

Now, we already provide an option: one can set VACUUM_OPTION_NO_PARALLEL to indicate that the IndexAM can't participate in a parallel vacuum. So, I feel if there is any IndexAM which would like to pass more data along with IndexBulkDeleteResult, it can use that option. It won't be very difficult to enhance or provide new APIs to support a parallel vacuum if we come across such a usage. I think we should just modify the comments atop VACUUM_OPTION_NO_PARALLEL to mention this. That should be good enough for the first version of parallel vacuum, considering we are able to support a parallel vacuum for all in-core indexes.

Thoughts?

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#300Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#298)

On Thu, 19 Dec 2019 at 11:47, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 6:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 3:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Few other comments which I have not fixed:

+    /* interface function to support parallel vacuum */
+    amestimateparallelvacuum_function amestimateparallelvacuum; /*
can be NULL */
} IndexAmRoutine;

One more thing, why have you removed the estimate function from the API patch?

Thinking about this again, it seems to me what you have done here is probably the right direction, because whatever else we do, we would either have some untested code or need to write/enhance some IndexAM to test this. The point is that we don't have any IndexAM in core (after working around the Gist index) which has this requirement, and we have not even heard from anyone of such a usage, so there is a good chance that whatever we do might not be sufficient for the IndexAMs that have such a usage.

Now, we already provide an option: one can set VACUUM_OPTION_NO_PARALLEL to indicate that the IndexAM can't participate in a parallel vacuum. So, I feel if there is any IndexAM which would like to pass more data along with IndexBulkDeleteResult, it can use that option. It won't be very difficult to enhance or provide new APIs to support a parallel vacuum if we come across such a usage.

Yeah, that's exactly what I was thinking; I was about to send such an email. The idea is good, but I thought we could exclude this feature from the first version of the patch because, after the gist index patch gets committed, we still won't have any index AM in core that uses that callback. That is, an index AM that does vacuum like the current gist indexes should set VACUUM_OPTION_NO_PARALLEL, and we can discuss this again when we get real feedback from index AM developers.

I think we should just modify the comments atop VACUUM_OPTION_NO_PARALLEL to mention this. That should be good enough for the first version of parallel vacuum, considering we are able to support a parallel vacuum for all in-core indexes.

I added some comments about that in the v36 patch, but I have slightly modified them.

I'll submit an updated version of the patch soon.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#301Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#293)
3 attachment(s)

On Wed, 18 Dec 2019 at 19:06, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 18, 2019 at 11:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit.kapila16@gmail.com> wrote:

I was analyzing your changes related to ReinitializeParallelDSM() and it seems like we might launch more workers than necessary for the bulkdelete phase. While creating a parallel context, we used the maximum of "workers required for the bulkdelete phase" and "workers required for cleanup", but now if the number of workers required in the bulkdelete phase is less than for the cleanup phase (as mentioned by you in one example), then we would launch more workers for the bulkdelete phase.

Good catch. Currently, when creating a parallel context, the number of workers passed to CreateParallelContext() is set not only to pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to specify the number of workers actually to launch after creating the parallel context, or when creating it. Or we could call ReinitializeParallelDSM() even the first time we run index vacuum.

How about just having ReinitializeParallelWorkers, which (as of now) can be called only via vacuum, even for the first time before the launch of workers?

See the attached for what I have in mind. A few other comments:

1.
+ shared->disable_delay = (params->options & VACOPT_FAST);

This should be part of the third patch.

2.
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+ LVRelStats *vacrelstats, LVParallelState *lps,
+ int nindexes)
{
..
..
+ /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+ nworkers = Min(nworkers, lps->pcxt->nworkers);
..
}

This should be an Assert. In no case can the computed workers be more than what we have in the context.

3.
+ if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+ ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
+ nindexes_parallel_cleanup++;

I think the second condition should be VACUUM_OPTION_PARALLEL_COND_CLEANUP.

I have fixed the above comments and some given by me earlier [1] in
the attached patch. The attached patch is a diff on top of
v36-0002-Add-parallel-option-to-VACUUM-command.

Thank you!

- /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
- nworkers = Min(nworkers, lps->pcxt->nworkers);
+ /*
+ * The number of workers required for parallel vacuum phase must be less
+ * than the number of workers with which parallel context is initialized.
+ */
+ Assert(lps->pcxt->nworkers >= nworkers);

Regarding the above change in your patch, I think we need to cap the number of workers by lps->pcxt->nworkers, because the computation of the number of indexes based on lps->nindexes_parallel_XXX can be larger than the number determined when creating the parallel context, for example when max_parallel_maintenance_workers is smaller than the number of indexes that can be vacuumed in parallel in the bulkdelete phase.

Few other comments which I have not fixed:

4.
+ if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+ nindexes_mwm++;
+
+ /* Skip indexes that don't participate parallel index vacuum */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;

Won't we need to worry about the number of indexes that use maintenance_work_mem only for indexes that can participate in a parallel vacuum? If so, the above checks need to be reversed.

You're right. Fixed.
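
The fixed ordering is roughly (counting maintenance_work_mem users
only among the participating indexes):

+ /* Skip indexes that don't participate in parallel index vacuum */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;
+
+ if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+ nindexes_mwm++;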

5.
/*
+ * Remember indexes that can participate parallel index vacuum and use
+ * it for index statistics initialization on DSM because the index
+ * size can get bigger during vacuum.
+ */
+ can_parallel_vacuum[i] = true;

I am not able to understand the second part of the comment ("because
the index size can get bigger during vacuum."). What is its
relevance?

I meant that the indexes can get bigger even during vacuum. So we need
to check the size of the indexes and determine their participation in
parallel index vacuum in one place.
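
In other words, participation is decided once, up front, and
remembered (a rough sketch of what compute_parallel_vacuum_workers
does):

+ for (i = 0; i < nindexes; i++)
+ {
+ uint8 vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+ /* The size is checked only here; the index may grow later */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;
+
+ can_parallel_vacuum[i] = true;
+ }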

6.
+/*
+ * Vacuum or cleanup indexes that can be processed by only the leader process
+ * because these indexes don't support parallel operation at that phase.
+ * Therefore this function must be called by the leader process.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, int nindexes,
IndexBulkDeleteResult **stats,
+   LVRelStats *vacrelstats, LVParallelState *lps)
{
..

Why have you changed the order of the nindexes parameter? I think in
the previous patch it was the last parameter, and that seems to be a
better place for it.

Since some existing code places nindexes right after *Irel, I thought
it was more understandable, but I'm also fine with the previous order.

Also, I think after the latest modifications, you can
remove the second sentence in the above comment ("Therefore this
function must be called by the leader process.").

Fixed.

7.
+ for (i = 0; i < nindexes; i++)
+ {
+ bool leader_only = (get_indstats(lps->lvshared, i) == NULL ||
+    skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+ /* Skip the indexes that can be processed by parallel workers */
+ if (!leader_only)
+ continue;

It is better to name this variable skip_index or something like that.

Fixed.

Attached is the updated version of the patch. It incorporates the
above comments and the comments from Mahendra. I also fixed one bug
around determining which indexes are vacuumed in parallel based on
their options and sizes. Please review it.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v37-0003-Add-FAST-option-to-vacuum-command.patch (application/octet-stream)
From d299d8abf75a4909d0890a3c26cd6863195bf2f1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 15:18:22 +0900
Subject: [PATCH v37 3/3] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 43 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 71876caf17..efbdc456ad 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable cost-based vacuum delay during parallel
+	 * index vacuum. This can be true when the user specified the FAST vacuum
+	 * option.
+	 */
+	bool		disable_delay;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -352,7 +359,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -755,7 +762,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -1991,16 +1998,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->disable_delay)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
		 * The number of workers can vary between the bulkdelete and cleanup
@@ -2019,7 +2029,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->disable_delay)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3081,7 +3091,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3101,7 +3111,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3109,7 +3119,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3188,6 +3198,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->disable_delay = (params->options & VACOPT_FAST);
 
 	/*
 	 * We need to care about alignment because we estimate the shared memory
@@ -3378,7 +3389,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->disable_delay));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 14a9b2432e..f93c93f9e0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 6bb15bef0d..25761a8096 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -182,7 +182,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 5b42371d95..f52d640589 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -117,6 +117,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 0cdda11b25..2e42239da3 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -98,6 +98,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 
 -- INDEX_CLEANUP option
-- 
2.23.0

v37-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 8a3e4a952e0509b292de193f626e6f814400273a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 14:24:26 +0900
Subject: [PATCH v37 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each index is processed by one vacuum process. Therefore
parallel vacuum can be used when the table has at least two indexes,
and the parallel degree cannot be larger than the number of indexes
that the table has.

The parallel degree is either specified by user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1251 +++++++++++++++++++++++--
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  126 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   15 +-
 src/test/regress/expected/vacuum.out  |   26 +
 src/test/regress/sql/vacuum.sql       |   25 +
 12 files changed, 1419 insertions(+), 120 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel vacuum
+      operation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum launches before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ab09d8408c..71876caf17 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  For updating the index statistics,
+ * we need to update the system table and since updates are not
+ * allowed during parallel mode we update the index statistics after exiting
+ * from the parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData)
+
+/*
+ * Shared information among parallel workers, so this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples in the index vacuum case or the new live tuples in
+	 * the index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared offsetof(LVShared, bitmap) + sizeof(bits8)
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +306,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +319,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +670,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +690,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -553,13 +749,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +948,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +977,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +997,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1193,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1232,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1378,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1448,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1477,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1592,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1626,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1642,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1461,12 +1666,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1532,7 +1744,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1541,7 +1753,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1589,6 +1801,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1599,16 +1812,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1729,19 +1942,392 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * The number of workers required for parallel vacuum phase must be less
+	 * than the number of workers with which parallel context is initialized.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/*
+		 * Skip processing indexes that either don't support parallel
+		 * operation or are too small for parallel index vacuum.
+		 */
+		if (get_indstats(lvshared, idx) == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * This must exist in DSM as we reach here only for indexes that
+		 * support the parallel operation.
+		 */
+		Assert(shared_indstats);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed by only the leader process
+ * because these indexes don't support parallel operation at that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup index either by leader process or by one of the worker
+ * process.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from
+	 * them, because they allocate it locally and it's possible that an
+	 * index will be vacuumed by a different vacuum process the next
+	 * time.  Copying the result normally happens only after the first
+	 * pass of index vacuuming.  From the second pass, we pass the result
+	 * stored in the DSM segment so that they update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,30 +2337,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1782,49 +2376,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2132,19 +2710,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2158,34 +2734,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2199,12 +2790,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2352,3 +2943,459 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.
+ * The relation size of the table doesn't affect the parallel degree for
+ * now.  nrequested is the number of parallel workers that the user
+ * requested.  If nrequested is 0, we compute the parallel degree based on
+ * nindexes, the number of indexes that support parallel index vacuuming.
+ * This function also sets can_parallel_vacuum to remember the indexes that
+ * participate in parallel index vacuuming.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		/*
+		 * Remember the indexes that can participate in parallel index
+		 * vacuuming based on their current size, and use this information
+		 * to initialize the index statistics on DSM.  We have to determine
+		 * the set of indexes that are vacuumed in parallel only once,
+		 * because the index sizes can change during vacuum.
+		 */
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of the stats for each index.  Since we currently don't support
+ * parallel vacuum for autovacuum, we don't need to care about
+ * autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns a parallel vacuum state if we can
+ * launch even one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * a parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed
+		 * in parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuuming */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * Take care to use the same MAXALIGN'd offset here that was used when
+	 * estimating the shared memory size above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from DSM into local memory and use that later to update
+ * the index statistics.  One might think that we could exit parallel mode,
+ * update the index statistics, and then destroy the parallel context, but
+ * that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(indstats->stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuuming or parallel
+ * index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process's.  It's
+	 * okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match the
+	 * ordering used by the leader.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..6c9ee65ba2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must be less than or equal to the
+	 * number of workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..14a9b2432e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested, but the user didn't
+				 * specify the parallel degree; it will be determined at
+				 * the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table entirely, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,59 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming
+ * is to give all parallel vacuum workers, including the leader process, a
+ * shared view of the cost-related parameters (mainly VacuumCostBalance),
+ * let each worker update it, and have each worker decide based on that
+ * whether it needs to sleep.  A worker is allowed to sleep only if it has
+ * performed I/O above a certain threshold, which is calculated from the
+ * number of active workers (VacuumActiveNWorkers), and only if the overall
+ * cost balance exceeds VacuumCostLimit.  The worker then sleeps in
+ * proportion to the work it has done and reduces VacuumSharedCostBalance
+ * by the amount it has consumed (VacuumCostBalanceLocal).  This avoids
+ * putting to sleep workers that have done little or no I/O compared to the
+ * others, and ensures that workers doing more I/O get throttled more.
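+ *
+ * As a worked example (with hypothetical settings): if VacuumCostLimit is
+ * 200, VacuumCostDelay is 20 and two workers are active, a worker becomes
+ * eligible to sleep once the shared balance has reached 200 and its own
+ * local balance exceeds 0.5 * (200 / 2) = 50; it then sleeps for
+ * 20 * VacuumCostBalanceLocal / 200 msec and subtracts its local balance
+ * from the shared balance.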
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* also compute the total local balance */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		shared_balance = pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset VacuumCostBalance, as we have already accumulated it into the
+	 * shared balance.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd8168ca..c3690f9c41 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2891,6 +2891,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..b9ad6cf671 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b05aedc670..6bb15bef0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -31,8 +31,8 @@
 /*
  * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
  * used by IndexAM's that don't want to or cannot participate in parallel vacuum.
- * If an index AM doesn't have a way to communicate the index statistics allocated
- * by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
+ * For example, if an index AM doesn't have a way to communicate the index statistics
+ * allocated by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
  * the index AM cannot participate in parallel vacuum.
  */
 #define VACUUM_OPTION_NO_PARALLEL			0
@@ -221,6 +221,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers: -1 by default (no workers),
+	 * or 0 to choose the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -230,6 +236,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..5b42371d95 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,32 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..0cdda11b25 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,31 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion and parallel index cleanup
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

v37-0001-Add-index-AM-field-and-callback-for-parallel-ind.patchapplication/octet-stream; name=v37-0001-Add-index-AM-field-and-callback-for-parallel-ind.patchDownload
From 6ad5bc29bd87b14e7aeec828f3731a12b126fb96 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v37 1/3] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     | 19 ++++++++++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/amapi.h                    |  4 ++
 src/include/commands/vacuum.h                 | 37 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 89 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..1a8b3474db 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..f9211a5ec9 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -731,6 +735,21 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory needed to
+   store statistics returned by the access method.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods that
+   do not support parallel vacuum, or in cases where the access method does
+   not require more than the size of
+   <structname>IndexBulkDeleteResult</structname> to store statistics.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..00ee84a896 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..5685e8caf6 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..7df990cc63 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..53db3ab24b 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index c67235ab80..a2904a9c82 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -123,6 +123,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..5d814eeba9 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..3fb5a030bd 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b05aedc670 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,43 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel vacuum.
+ * If an index AM doesn't have a way to communicate the index statistics allocated
+ * by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
+ * the index AM cannot participate in parallel vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAM's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete has not been
+ * performed yet.  This will be used by IndexAM's that need to scan the
+ * index during cleanup only when bulkdelete was not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..246d68ffc8 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0

#302Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#292)

On Wed, 18 Dec 2019 at 12:07, Amit Kapila <amit.kapila16@gmail.com> wrote:

[please trim extra text before responding]

On Wed, Dec 18, 2019 at 12:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Tue, 10 Dec 2019 at 00:30, Mahendra Singh <mahi6run@gmail.com> wrote:

3.
After the v35 patch, the vacuum.sql regression test takes too much time due to the large number of inserts, so by reducing the number of tuples we can reduce that time.
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;

Here, instead of 100000, we can use 1000 to reduce the runtime of this test case, because we only want to test the code and functionality.

Since we added a check of min_parallel_index_scan_size in the v36 patch
set to decide whether to use parallel vacuum, 1000 tuples are not enough
to trigger a parallel vacuum. I can see that we are not launching any
workers in the vacuum.sql test case, and hence code coverage has also
decreased. I am not sure how to fix this.

Try setting min_parallel_index_scan_size to 0 in the test case.

Thanks Amit for the fix.

Yes, we can add "set min_parallel_index_scan_size = 0;" to the vacuum.sql
test case. I tested with min_parallel_index_scan_size = 0 and it works
fine.

@Masahiko san, please add the above line to the vacuum.sql test case.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#303Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#301)

On Thu, Dec 19, 2019 at 11:11 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 18 Dec 2019 at 19:06, Amit Kapila <amit.kapila16@gmail.com> wrote:

- /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
- nworkers = Min(nworkers, lps->pcxt->nworkers);
+ /*
+ * The number of workers required for parallel vacuum phase must be less
+ * than the number of workers with which parallel context is initialized.
+ */
+ Assert(lps->pcxt->nworkers >= nworkers);

Regarding the above change in your patch, I think we need to cap the
number of workers by lps->pcxt->nworkers, because the count computed
from lps->nindexes_parallel_XXX can be larger than the number determined
when creating the parallel context, for example when
max_parallel_maintenance_workers is smaller than the number of indexes
that can be vacuumed in parallel in the bulkdelete phase.

Oh, right, but then you should probably write a comment, as this is not so obvious.
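
Something like the following (just a sketch, using the names from the
quoted hunk) would make the intent explicit:

	/*
	 * Cap the number of workers by the value given when creating the
	 * parallel context: the per-phase computation based on
	 * lps->nindexes_parallel_XXX can exceed it when
	 * max_parallel_maintenance_workers limited the context size.
	 */
	nworkers = Min(nworkers, lps->pcxt->nworkers);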

A few other comments which I have not fixed:

4.
+ if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+ nindexes_mwm++;
+
+ /* Skip indexes that don't participate parallel index vacuum */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;

Won't we need to count the number of indexes that use
maintenance_work_mem only among the indexes that can participate in a
parallel vacuum? If so, the above checks need to be reversed.

You're right. Fixed.
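
For reference, the reordered loop body would look roughly like this (a
sketch; names as in the quoted hunk):

	/* Skip indexes that don't participate in parallel index vacuuming */
	if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
		RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
		continue;

	/* Count maintenance_work_mem users only among participating indexes */
	if (Irel[i]->rd_indam->amusemaintenanceworkmem)
		nindexes_mwm++;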

5.
/*
+ * Remember indexes that can participate parallel index vacuum and use
+ * it for index statistics initialization on DSM because the index
+ * size can get bigger during vacuum.
+ */
+ can_parallel_vacuum[i] = true;

I am not able to understand the second part of the comment ("because
the index size can get bigger during vacuum."). What is its
relevance?

I meant that the indexes can get bigger even during vacuum. So we need
to check the index sizes and determine participation in parallel index
vacuum in one place.

Okay, but that doesn't go with the earlier part of the comment. We
can either remove it or explain it a bit more.
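
If we keep it, something along these lines (just a sketch of the expanded
wording) might be clearer:

	/*
	 * Remember the indexes that can participate in parallel index
	 * vacuuming.  We must decide this set exactly once, up front, because
	 * the index sizes can change while vacuum runs; the same set is later
	 * used to initialize the per-index statistics slots in DSM.
	 */
	can_parallel_vacuum[i] = true;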

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#304Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#301)

On Thu, Dec 19, 2019 at 12:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached the updated version patch. This version patch incorporates
the above comments and the comments from Mahendra. I also fixed one
bug around determining the indexes that are vacuumed in parallel based
on their option and size. Please review it.

I'm not enthusiastic about the fact that 0003 calls the fast option
'disable_delay' in some places. I think it would be clearer to call
it 'fast' everywhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#305Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Robert Haas (#304)
3 attachment(s)

On Thu, 19 Dec 2019 at 22:48, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 19, 2019 at 12:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached the updated version patch. This version patch incorporates
the above comments and the comments from Mahendra. I also fixed one
bug around determining the indexes that are vacuumed in parallel based
on their option and size. Please review it.

I'm not enthusiastic about the fact that 0003 calls the fast option
'disable_delay' in some places. I think it would be more clear to call
it 'fast' everywhere.

Agreed.

I've attached the updated version of the patch, incorporating all the
review comments I have received so far.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v38-0003-Add-FAST-option-to-vacuum-command.patchapplication/octet-stream; name=v38-0003-Add-FAST-option-to-vacuum-command.patchDownload
From c831ce9a27e0017ba121cd5075ed95db74b50f5f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 15:18:22 +0900
Subject: [PATCH v38 3/3] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 43 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 752398ed07..71fbaa3e51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable the cost-based vacuum delay during
+	 * parallel index vacuum.  This can be true when the user specified the
+	 * FAST vacuum option.
+	 */
+	bool		fast;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -352,7 +359,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -755,7 +762,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -1988,16 +1995,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->fast)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
 		 * The number of workers can vary between bulkdelete and cleanup
@@ -2016,7 +2026,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->fast)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3065,7 +3075,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3085,7 +3095,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3093,7 +3103,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3172,6 +3182,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->fast = (params->options & VACOPT_FAST);
 
 	/*
 	 * Take care to use the same MAXALIGN'd offset here that was used when
 	 * estimating the shared memory size above.
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->fast));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 14a9b2432e..f93c93f9e0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 6bb15bef0d..25761a8096 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -182,7 +182,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 33e15f0200..ac96d25bb9 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -118,6 +118,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index cd57283854..f872ef135b 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -99,6 +99,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+-- FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 DROP TABLE pvactst;
 
 -- INDEX_CLEANUP option
-- 
2.23.0
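
To make the new option's behavior concrete: the whole effect of FAST is to
keep cost-based delay off for the entire run, in both the leader and any
parallel workers.  A minimal sketch of the gating that the two
VacuumCostActive hunks above implement (the helper name is made up and not
part of the patch):

    #include <stdbool.h>

    #define VACOPT_FAST (1 << 8)	/* disable vacuum delay */

    /*
     * Sketch only: cost-based delay is active when a delay is configured
     * and FAST was not requested.  The same test guards the leader
     * (vacuum.c) and the parallel workers (vacuumlazy.c).
     */
    static bool
    vacuum_cost_delay_active(int options, double vacuum_cost_delay)
    {
        return (vacuum_cost_delay > 0) && ((options & VACOPT_FAST) == 0);
    }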

v38-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch (application/octet-stream)
From 95f0fe213bd2378ba87b94af8ee1173e26f65056 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 15 Oct 2019 17:03:22 +0900
Subject: [PATCH v38 1/3] Add index AM field and callback for parallel index
 vacuum

---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     | 19 ++++++++++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/amapi.h                    |  4 ++
 src/include/commands/vacuum.h                 | 37 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 89 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..1a8b3474db 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..f9211a5ec9 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
@@ -731,6 +735,21 @@ amparallelrescan (IndexScanDesc scan);
    the beginning.
   </para>
 
+  <para>
+<programlisting>
+Size
+amestimateparallelvacuum (void);
+</programlisting>
+   Estimate and return the number of bytes of dynamic shared memory needed to
+   store statistics returned by the access method.
+  </para>
+
+  <para>
+   It is not necessary to implement this function for access methods which
+   do not support parallel vacuum or in cases where the access method does not
+   require more than the size of <structname>IndexBulkDeleteResult</structname> to
+   store statistics.
+  </para>
  </sect1>
 
  <sect1 id="index-scanning">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..00ee84a896 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..5685e8caf6 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
+	amroutine->amusemaintenanceworkmem = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..7df990cc63 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..53db3ab24b 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 065b5290b0..b38eee26b3 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..5d814eeba9 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..3fb5a030bd 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b05aedc670 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,43 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by index AMs that don't want to or cannot participate in parallel vacuum.
+ * If an index AM doesn't have a way to communicate the index statistics allocated
+ * by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
+ * the index AM cannot participate in parallel vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * index AMs that need to scan the index to delete tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete has not been
+ * performed yet.  This will be used by index AMs that need to scan the
+ * index during cleanup only when bulkdelete was not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by index AMs that scan the index
+ * during the cleanup phase irrespective of whether it was already scanned
+ * during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..246d68ffc8 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
+	amroutine->amusemaintenanceworkmem = false;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0
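
As a concrete illustration of the two fields this patch adds, a hypothetical
out-of-core index AM could fill them in as below.  This is a sketch only:
"myamhandler" is a made-up name and the bulk-delete-only policy is just one
possible choice; the in-core handlers in the patch show the real settings
for each AM.

    #include "postgres.h"

    #include "access/amapi.h"
    #include "commands/vacuum.h"
    #include "fmgr.h"
    #include "nodes/nodes.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(myamhandler);

    Datum
    myamhandler(PG_FUNCTION_ARGS)
    {
        IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

        /* ... fill in the existing am* fields as any handler does ... */

        /* let parallel workers run ambulkdelete, but never amvacuumcleanup */
        amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL;

        /* this AM's vacuum memory use doesn't scale with maintenance_work_mem */
        amroutine->amusemaintenanceworkmem = false;

        PG_RETURN_POINTER(amroutine);
    }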

v38-0002-Add-parallel-option-to-VACUUM-command.patch (application/octet-stream)
From 140b48155f222c7641193749a25b275ae349ceda Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 17 Dec 2019 14:24:26 +0900
Subject: [PATCH v38 2/3] Add parallel option to VACUUM command

This change adds a PARALLEL option to the VACUUM command that enables
us to perform index vacuuming and index cleanup with background
workers. Each individual index is processed by one vacuum
process. Therefore parallel vacuum can be used only when the table has
at least two indexes, and a parallel degree larger than the number of
indexes on the table cannot be specified.

The parallel degree is either specified by the user or determined based
on the number of indexes on the table, and is further limited by
max_parallel_maintenance_workers. The table size and index size don't
affect it.
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1235 +++++++++++++++++++++++--
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  126 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   15 +-
 src/test/regress/expected/vacuum.out  |   27 +
 src/test/regress/sql/vacuum.sql       |   26 +
 12 files changed, 1405 insertions(+), 120 deletions(-)
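
To make the degree computation above concrete, here is a minimal sketch of
the rules as stated (illustrative only: the authoritative logic is
compute_parallel_vacuum_workers() in vacuumlazy.c below, and the helper name
here is made up):

    /*
     * nrequested == 0 means the PARALLEL degree was omitted;
     * nindexes_parallel is the number of indexes on the table that
     * support parallel vacuum.  Min() is the usual c.h macro.
     */
    static int
    choose_parallel_degree(int nrequested, int nindexes_parallel)
    {
        int     nworkers;

        /* cannot use more workers than parallel-capable indexes */
        nworkers = (nrequested > 0) ?
            Min(nrequested, nindexes_parallel) : nindexes_parallel;

        /* further limited by max_parallel_maintenance_workers */
        return Min(nworkers, max_parallel_maintenance_workers);
    }

For example, a table with four parallel-capable indexes under
max_parallel_maintenance_workers = 2 would run with two workers in addition
to the leader.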

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuuming and index cleanup phases of
+      <command>VACUUM</command> in parallel using
+      <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on
+      the relation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuuming.
+     Even if this option is specified together with the
+     <option>ANALYZE</option> option, it does not affect
+     <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ab09d8408c..752398ed07 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Updating the index statistics
+ * requires writing to a system catalog, and since writes are not allowed
+ * during parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers; this is therefore allocated in
+ * the DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples in the index vacuum case or the new live tuples in
+	 * the index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which indexes have stats in shared memory.  A set bit in the
+	 * map indicates that the particular index supports parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +306,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +319,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +670,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		the heap scan so that we can record dead tuples in the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +690,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -553,13 +749,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +948,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +977,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +997,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1193,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1232,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1378,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1448,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1477,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1592,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1626,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1642,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1461,12 +1666,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1532,7 +1744,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1541,7 +1753,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1589,6 +1801,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1599,16 +1812,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1729,19 +1942,383 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/* Cap by the number of workers with which parallel context is initialized */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend as we have
+			 * already accumulated the remaining balance of the heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation at this phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it, because
+	 * they allocate it locally and a different vacuum process might process
+	 * the index next time.  The copying of the result normally happens only
+	 * after the first index vacuuming.  From the second time onward, we pass
+	 * the result on the DSM segment so that it is updated there directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,30 +2328,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1782,49 +2367,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2132,19 +2701,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2158,34 +2725,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2199,12 +2781,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2352,3 +2934,452 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed together with parallel workers.
+ * The relation size of the table doesn't affect the parallel degree for now.
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, the
+ * number of indexes that support parallel index vacuuming.  This function
+ * also sets can_parallel_vacuum to remember indexes that participate in
+ * parallel index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in standalone backends
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * and the size of stats for each index.  Since we don't currently support
+ * parallel vacuum for autovacuum, we don't need to care about
+ * autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch even one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested, and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * Cleanup option should be either disabled, always performing in
+		 * parallel or conditionally performing in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * MAXALIGN the offset to match how we estimated the shared memory size
+	 * above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and later use them to
+ * update pg_class.  One might think that we could exit parallel mode,
+ * update the index statistics, and then destroy the parallel context, but
+ * that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(indstats->stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum; lps is passed by value, so assigning NULL
+	 * to it here would be a no-op for the caller. */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check if the given index participates in parallel index vacuum or parallel
+ * index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * That's okay because the lock mode does not conflict among parallel
+	 * vacuum workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match the
+	 * order the leader sees.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..6c9ee65ba2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers that need to be launched must be less than the
+	 * number of workers with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..14a9b2432e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * A parallel lazy vacuum is requested, but the user didn't
+				 * specify the parallel degree.  The degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is
+	 * the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,59 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers, including the leader process, to
+ * have a shared view of cost-related parameters (mainly VacuumCostBalance).
+ * Each worker updates the shared balance and, based on it, decides whether
+ * it needs to sleep.  A worker is allowed to sleep only if it has performed
+ * I/O above a certain threshold, which is calculated based on the number of
+ * active workers (VacuumActiveNWorkers), and the overall cost balance
+ * exceeds the VacuumCostLimit set by the system.  The worker then sleeps in
+ * proportion to the work done and reduces VacuumSharedCostBalance by the
+ * amount it consumed (VacuumCostBalanceLocal).  This avoids making workers
+ * sleep when they have done little or no I/O compared to other workers, and
+ * ensures that workers doing more I/O are throttled more.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* also compute the total local balance */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		shared_balance = pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e919317bab..641999c0f4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..b9ad6cf671 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b05aedc670..6bb15bef0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -31,8 +31,8 @@
 /*
  * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
  * used by IndexAM's that don't want to or cannot participate in parallel vacuum.
- * If an index AM doesn't have a way to communicate the index statistics allocated
- * by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
+ * For example, if an index AM doesn't have a way to communicate the index statistics
+ * allocated by the first ambulkdelete call to the subsequent ones until amvacuumcleanup,
  * the index AM cannot participate in parallel vacuum.
  */
 #define VACUUM_OPTION_NO_PARALLEL			0
@@ -221,6 +221,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * workers; 0 means the degree is chosen based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -230,6 +236,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..33e15f0200 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,33 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+set min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..cd57283854 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,32 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+set min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0
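
To orient readers before the review discussion that follows, here is a
rough leader-side sketch of how the functions defined in the patch fit
together. This is not part of the patch: the function names follow the
patch, but the per-phase index loops, error handling, and progress
reporting are elided.

/*
 * Sketch only: the leader decides the worker count, launches workers for
 * each index vacuum/cleanup phase, and copies the index statistics out of
 * the DSM before leaving parallel mode.  begin_parallel_vacuum() returns
 * NULL when no worker can be launched, in which case vacuuming proceeds
 * serially.
 */
static void
parallel_vacuum_leader_sketch(Oid relid, Relation *Irel,
							  LVRelStats *vacrelstats, BlockNumber nblocks,
							  int nindexes, int nrequested)
{
	IndexBulkDeleteResult **stats = (IndexBulkDeleteResult **)
		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
	LVParallelState *lps = begin_parallel_vacuum(relid, Irel, vacrelstats,
												 nblocks, nindexes, nrequested);

	if (lps != NULL)
	{
		/* In the real code this happens once per bulkdelete/cleanup phase. */
		LaunchParallelWorkers(lps->pcxt);
		/* ... the leader also processes indexes itself, then waits ... */
		WaitForParallelWorkersToFinish(lps->pcxt);

		/* Copy stats from the DSM, destroy the context, exit parallel mode. */
		end_parallel_vacuum(Irel, stats, lps, nindexes);
	}

	/* Outside parallel mode, persist accurate statistics to pg_class. */
	update_index_statistics(Irel, stats, nindexes);
}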

#306Prabhat Sahu
prabhat.sahu@enterprisedb.com
In reply to: Mahendra Singh (#302)

Hi,

While testing this feature with parallel vacuum on "TEMPORARY TABLE", I got
a server crash on PG Head+V36_patch.
Changed configuration parameters and Stack trace are as below:

autovacuum = on
max_worker_processes = 4
shared_buffers = 10MB
max_parallel_workers = 8
max_parallel_maintenance_workers = 8
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 0

-- Stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.1399 postgres
Reading symbols from
/home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 1399]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: autovacuum worker postgres '.
Program terminated with signal 6, Aborted.
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64
libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64
openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64
zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
#1 0x00007f4517d81a28 in abort () from /lib64/libc.so.6
#2 0x0000000000a96341 in ExceptionalCondition (conditionName=0xd18efb
"strvalue != NULL", errorType=0xd18eeb "FailedAssertion",
fileName=0xd18ee0 "snprintf.c", lineNumber=442) at assert.c:67
#3 0x0000000000b02522 in dopr (target=0x7ffdb0e38450, format=0xc5fa95
".%s\"", args=0x7ffdb0e38538) at snprintf.c:442
#4 0x0000000000b01ea6 in pg_vsnprintf (str=0x256df50 "autovacuum: dropping
orphan temp table \"postgres.", '\177' <repeats 151 times>..., count=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at snprintf.c:195
#5 0x0000000000afbadf in pvsnprintf (buf=0x256df50 "autovacuum: dropping
orphan temp table \"postgres.", '\177' <repeats 151 times>..., len=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at psprintf.c:110
#6 0x0000000000afd34b in appendStringInfoVA (str=0x7ffdb0e38550,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538)
at stringinfo.c:149
#7 0x0000000000a970fd in errmsg (fmt=0xc5fa68 "autovacuum: dropping orphan
temp table \"%s.%s.%s\"") at elog.c:832
#8 0x00000000008588d2 in do_autovacuum () at autovacuum.c:2249
#9 0x0000000000857b29 in AutoVacWorkerMain (argc=0, argv=0x0) at
autovacuum.c:1689
#10 0x000000000085772f in StartAutoVacWorker () at autovacuum.c:1483
#11 0x000000000086e64f in StartAutovacuumWorker () at postmaster.c:5562
#12 0x000000000086e106 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5279
#13 <signal handler called>
#14 0x00007f4517e3f933 in __select_nocancel () from /lib64/libc.so.6
#15 0x0000000000869838 in ServerLoop () at postmaster.c:1691
#16 0x0000000000869212 in PostmasterMain (argc=3, argv=0x256bd70) at
postmaster.c:1400
#17 0x000000000077855d in main (argc=3, argv=0x256bd70) at main.c:210
(gdb)

I have tried to reproduce the crash with all previously executed queries,
but now I am not able to reproduce it.

On Thu, Dec 19, 2019 at 11:26 AM Mahendra Singh <mahi6run@gmail.com> wrote:

On Wed, 18 Dec 2019 at 12:07, Amit Kapila <amit.kapila16@gmail.com> wrote:

[please trim extra text before responding]

On Wed, Dec 18, 2019 at 12:01 PM Mahendra Singh <mahi6run@gmail.com> wrote:

On Tue, 10 Dec 2019 at 00:30, Mahendra Singh <mahi6run@gmail.com> wrote:

3.
After the v35 patch, the vacuum.sql regression test is taking too much time
due to the large number of inserts, so by reducing the number of tuples we
can reduce that time.

+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;

Here, instead of 100000, we can use 1000 to reduce the time of this test
case, because we only want to test the code and functionality.

As we added a check of min_parallel_index_scan_size in the v36 patch set
to decide on parallel vacuum, 1000 tuples are not enough to do a parallel
vacuum. I can see that we are not launching any workers in the vacuum.sql
test case, and hence code coverage has also decreased. I am not sure how
to fix this.

Try setting min_parallel_index_scan_size to 0 in the test case.

Thanks Amit for the fix.

Yes, we can add "set min_parallel_index_scan_size = 0;" to the vacuum.sql
test case. I tested by setting min_parallel_index_scan_size = 0 and it
works fine.

@Masahiko san, please add the above line to the vacuum.sql test case.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Software India Pvt. Ltd.

The Postgres Database Company

#307Amit Kapila
amit.kapila16@gmail.com
In reply to: Prabhat Sahu (#306)

On Fri, Dec 20, 2019 at 5:17 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com>
wrote:

Hi,

While testing this feature with parallel vacuum on "TEMPORARY TABLE", I
got a server crash on PG Head+V36_patch.

From the call stack, it is not clear whether it is related to the patch at
all. Have you checked your test with and without the patch? The reason I
ask is that the patch doesn't perform a parallel vacuum on temporary tables.

Changed configuration parameters and Stack trace are as below:

-- Stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.1399 postgres
Reading symbols from
/home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 1399]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: autovacuum worker postgres '.
Program terminated with signal 6, Aborted.
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64
libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64
openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64
zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
#1 0x00007f4517d81a28 in abort () from /lib64/libc.so.6
#2 0x0000000000a96341 in ExceptionalCondition (conditionName=0xd18efb
"strvalue != NULL", errorType=0xd18eeb "FailedAssertion",
fileName=0xd18ee0 "snprintf.c", lineNumber=442) at assert.c:67
#3 0x0000000000b02522 in dopr (target=0x7ffdb0e38450, format=0xc5fa95
".%s\"", args=0x7ffdb0e38538) at snprintf.c:442
#4 0x0000000000b01ea6 in pg_vsnprintf (str=0x256df50 "autovacuum:
dropping orphan temp table \"postgres.", '\177' <repeats 151 times>...,
count=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at snprintf.c:195
#5 0x0000000000afbadf in pvsnprintf (buf=0x256df50 "autovacuum: dropping
orphan temp table \"postgres.", '\177' <repeats 151 times>..., len=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at psprintf.c:110
#6 0x0000000000afd34b in appendStringInfoVA (str=0x7ffdb0e38550,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538)
at stringinfo.c:149
#7 0x0000000000a970fd in errmsg (fmt=0xc5fa68 "autovacuum: dropping
orphan temp table \"%s.%s.%s\"") at elog.c:832
#8 0x00000000008588d2 in do_autovacuum () at autovacuum.c:2249

The call stack seems to indicate that the backend in which you were doing
the operations on temporary tables crashed somehow, and autovacuum then
tried to clean up that orphaned temporary table. It crashes while printing
the message for dropping orphan tables. Below is that message:

ereport(LOG,
        (errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
                get_database_name(MyDatabaseId),
                get_namespace_name(classForm->relnamespace),
                NameStr(classForm->relname))));

Now, the assertion can fail only if one of the three parameters (database
name, namespace, relname) is NULL, and I can't see any way for that to
happen unless you have manually removed the namespace or the database.
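
To illustrate the failure mode (this snippet is mine, not server code):
with assertions enabled, any of those lookups returning NULL would reach
the Assert(strvalue != NULL) in snprintf.c shown in frames #2/#3 of the
backtrace, e.g.:

/*
 * Illustrative only: get_namespace_name() returns NULL if the namespace
 * OID no longer exists.  Feeding that NULL to a %s specifier is what
 * fires Assert(strvalue != NULL) in src/port/snprintf.c.
 */
char	   *nspname = get_namespace_name(classForm->relnamespace);

ereport(LOG,
		(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
				get_database_name(MyDatabaseId),
				nspname,		/* NULL here => assertion failure */
				NameStr(classForm->relname))));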

(gdb)

I have tried to reproduce the crash with all previously executed queries,
but now I am not able to reproduce it.

I am not sure how we can conclude from this whether there is any problem
with this patch or otherwise, unless you have some steps to show us what
you have done. It could happen if you somehow corrupted the database by
manually removing stuff, or maybe there is some genuine bug, but it is not
at all clear.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#308Mahendra Singh
mahi6run@gmail.com
In reply to: Prabhat Sahu (#306)

On Fri, 20 Dec 2019 at 17:17, Prabhat Sahu
<prabhat.sahu@enterprisedb.com> wrote:

Hi,

While testing this feature with parallel vacuum on "TEMPORARY TABLE", I got a server crash on PG Head+V36_patch.
Changed configuration parameters and Stack trace are as below:

autovacuum = on
max_worker_processes = 4
shared_buffers = 10MB
max_parallel_workers = 8
max_parallel_maintenance_workers = 8
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 0

-- Stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.1399 postgres
Reading symbols from /home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 1399]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: autovacuum worker postgres '.
Program terminated with signal 6, Aborted.
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
#1 0x00007f4517d81a28 in abort () from /lib64/libc.so.6
#2 0x0000000000a96341 in ExceptionalCondition (conditionName=0xd18efb "strvalue != NULL", errorType=0xd18eeb "FailedAssertion",
fileName=0xd18ee0 "snprintf.c", lineNumber=442) at assert.c:67
#3 0x0000000000b02522 in dopr (target=0x7ffdb0e38450, format=0xc5fa95 ".%s\"", args=0x7ffdb0e38538) at snprintf.c:442
#4 0x0000000000b01ea6 in pg_vsnprintf (str=0x256df50 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., count=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538) at snprintf.c:195
#5 0x0000000000afbadf in pvsnprintf (buf=0x256df50 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., len=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538) at psprintf.c:110
#6 0x0000000000afd34b in appendStringInfoVA (str=0x7ffdb0e38550, fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538)
at stringinfo.c:149
#7 0x0000000000a970fd in errmsg (fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"") at elog.c:832
#8 0x00000000008588d2 in do_autovacuum () at autovacuum.c:2249
#9 0x0000000000857b29 in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1689
#10 0x000000000085772f in StartAutoVacWorker () at autovacuum.c:1483
#11 0x000000000086e64f in StartAutovacuumWorker () at postmaster.c:5562
#12 0x000000000086e106 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#13 <signal handler called>
#14 0x00007f4517e3f933 in __select_nocancel () from /lib64/libc.so.6
#15 0x0000000000869838 in ServerLoop () at postmaster.c:1691
#16 0x0000000000869212 in PostmasterMain (argc=3, argv=0x256bd70) at postmaster.c:1400
#17 0x000000000077855d in main (argc=3, argv=0x256bd70) at main.c:210
(gdb)

I have tried to reproduce the crash with all previously executed queries, but now I am not able to reproduce it.

Thanks Prabhat for reporting this issue.

I am able to reproduce this issue at my end. I tested and verified
that this issue is not related to the parallel vacuum patch: I am able
to reproduce it on HEAD without the parallel vacuum patch (v37).

I will report this issue in a new thread with a reproducible test case.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#309Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh (#308)

On Mon, 23 Dec 2019 at 16:24, Mahendra Singh <mahi6run@gmail.com> wrote:

On Fri, 20 Dec 2019 at 17:17, Prabhat Sahu
<prabhat.sahu@enterprisedb.com> wrote:

Hi,

While testing this feature with parallel vacuum on "TEMPORARY TABLE", I got a server crash on PG Head+V36_patch.
Changed configuration parameters and Stack trace are as below:

autovacuum = on
max_worker_processes = 4
shared_buffers = 10MB
max_parallel_workers = 8
max_parallel_maintenance_workers = 8
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 0

-- Stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.1399 postgres
Reading symbols from /home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 1399]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: autovacuum worker postgres '.
Program terminated with signal 6, Aborted.
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f4517d80337 in raise () from /lib64/libc.so.6
#1 0x00007f4517d81a28 in abort () from /lib64/libc.so.6
#2 0x0000000000a96341 in ExceptionalCondition (conditionName=0xd18efb "strvalue != NULL", errorType=0xd18eeb "FailedAssertion",
fileName=0xd18ee0 "snprintf.c", lineNumber=442) at assert.c:67
#3 0x0000000000b02522 in dopr (target=0x7ffdb0e38450, format=0xc5fa95 ".%s\"", args=0x7ffdb0e38538) at snprintf.c:442
#4 0x0000000000b01ea6 in pg_vsnprintf (str=0x256df50 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., count=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538) at snprintf.c:195
#5 0x0000000000afbadf in pvsnprintf (buf=0x256df50 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., len=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538) at psprintf.c:110
#6 0x0000000000afd34b in appendStringInfoVA (str=0x7ffdb0e38550, fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffdb0e38538)
at stringinfo.c:149
#7 0x0000000000a970fd in errmsg (fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"") at elog.c:832
#8 0x00000000008588d2 in do_autovacuum () at autovacuum.c:2249
#9 0x0000000000857b29 in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1689
#10 0x000000000085772f in StartAutoVacWorker () at autovacuum.c:1483
#11 0x000000000086e64f in StartAutovacuumWorker () at postmaster.c:5562
#12 0x000000000086e106 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#13 <signal handler called>
#14 0x00007f4517e3f933 in __select_nocancel () from /lib64/libc.so.6
#15 0x0000000000869838 in ServerLoop () at postmaster.c:1691
#16 0x0000000000869212 in PostmasterMain (argc=3, argv=0x256bd70) at postmaster.c:1400
#17 0x000000000077855d in main (argc=3, argv=0x256bd70) at main.c:210
(gdb)

I have tried to reproduce the crash with all previously executed queries, but now I am not able to reproduce it.

Thanks Prabhat for reporting this issue.

I am able to reproduce this issue at my end. I tested and verified
that this issue is not related to the parallel vacuum patch: I am able
to reproduce it on HEAD without the parallel vacuum patch (v37).

I will report this issue in a new thread with a reproducible test case.

Thank you so much!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#310Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#305)
4 attachment(s)

On Fri, Dec 20, 2019 at 12:13 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the updated version patch that incorporates all the
review comments I got so far.

I have further edited the first two patches posted by you. The
changes include (a) changing tests to reset the GUC, (b) removing some
stuff that is not required in this version, (c) moving some variables
around into a better order, (d) changing comments and a few other
cosmetic things, and (e) adding commit messages for the first two patches.

I think the first two patches attached in this email are in good shape,
and we can commit those unless you or someone else has more comments on
them. The main parallel vacuum patch can still be improved by some
more testing, polishing, and review. I am planning to push the first two
patches next week after another pass. The first two patches are explained
briefly below:

1. v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM: It
allows us to delete empty pages in each pass during GIST VACUUM.
Earlier, we used to postpone deleting empty pages till the second stage
of vacuum to amortize the cost of scanning internal pages. However,
that can sometimes (say, if vacuum is canceled or errors out between the
first and second stages) delay the recycling of pages. Another thing is
that to facilitate deleting empty pages in the second stage, we need
to share the information of internal and empty pages between different
stages of vacuum. It would be quite tricky to share this information
via DSM, which would be required for the main parallel vacuum patch. Also,
this brings the logic to reclaim deleted pages closer to nbtree's,
where we delete empty pages in each pass. Overall, the advantages of
deleting empty pages in each pass outweigh the advantages of
postponing the same. This patch is discussed in detail in a separate
thread [1]/messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com.

2. v39-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch:
Introduce new fields amusemaintenanceworkmem and
amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The
amusemaintenanceworkmem tells whether a particular IndexAM uses
maintenance_work_mem or not. This will help in controlling the memory
used by individual workers as otherwise, each worker can consume
memory equal to maintenance_work_mem. This has been discussed in
detail in a separate thread as well [2]/messages/by-id/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com. The amparallelvacuumoptions
tells whether a particular IndexAM participates in a parallel vacuum
and, if so, in which phases (bulkdelete, vacuumcleanup) of vacuum.
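
Concretely, each index AM's handler declares these capabilities in its
IndexAmRoutine; for example, the nbtree hunk in the attached patch sets:

	/* btree does not rely on maintenance_work_mem per worker, and it
	 * supports parallel bulk-deletion plus conditional parallel cleanup */
	amroutine->amusemaintenanceworkmem = false;
	amroutine->amparallelvacuumoptions =
		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;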

[1]: /messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
[2]: /messages/by-id/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v39-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchapplication/octet-stream; name=v39-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchDownload
From 5761ed02e51415dbb54f8c6639fbac9a1cd12b8f Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 14:46:37 +0530
Subject: [PATCH 1/3] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and, if so, in which phases (bulkdelete, vacuumcleanup)
of vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063ba..1874aee 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68..37f8d87 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6..abd8c40 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 3859355..64bd81a 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80..e29a43b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0..8b9272c 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 065b529..313e31c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391e..cbec182 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06..556affb 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae..b9becdb 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by index AMs that don't want to or cannot participate in a parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in a parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * index AMs that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel only if bulkdelete has not been
+ * performed yet.  This will be used by index AMs whose vacuumcleanup scans
+ * the index only when bulkdelete has not been performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by index AMs that scan the index
+ * during the cleanup phase irrespective of whether the index has already
+ * been scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
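
(For illustration only, not part of the patch: the following minimal sketch
shows how a caller could interpret these flags when deciding whether a
worker may process an index in the current phase.  The function name and
arguments are hypothetical.)

    /* Sketch: can this index be processed by a parallel vacuum worker now? */
    static bool
    example_can_process_in_parallel(uint8 vacoptions, bool for_cleanup,
                                    bool bulkdel_done)
    {
        Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);

        if (!for_cleanup)
            return (vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;

        /* Unconditional parallel cleanup */
        if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
            return true;

        /* Conditional cleanup: only if bulkdelete has not been performed */
        return !bulkdel_done &&
            (vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0;
    }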
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e..6bfd883 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
1.8.3.1

Attachment: v39-0002-Add-a-parallel-option-to-the-VACUUM-command.patch (application/octet-stream)
From 30aee1346a34c15bf321ef4743b861bc63b219f6 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:03:14 +0530
Subject: [PATCH 2/3] Add a parallel option to the VACUUM command.

This change adds a PARALLEL option to the VACUUM command that enables us to
perform index vacuuming and index cleanup with background workers.  Each
index is processed by at most one vacuum process, so a parallel vacuum
can be used only when the table has at least two indexes.  Users can
specify a parallel degree along with this option, which indicates the
number of workers to use; that number is further limited by the number of
indexes on the table.  This option can't be used with the FULL option.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers.  An index can participate in a parallel
vacuum only if its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Mahendra Singh and
Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 ++
 src/backend/access/heap/vacuumlazy.c  | 1240 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  132 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 12 files changed, 1416 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c902..7475627 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..9fee083 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -224,6 +225,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of
+      <command>VACUUM</command> in parallel using
+      <replaceable class="parameter">integer</replaceable> background
+      workers (for details of each vacuum phase, please refer to
+      <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on the
+      relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before the start of each phase and exit at the
+      end of the phase.  These behaviors might change in a future release.
+      This option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +265,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,6 +356,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified together with the
+     <option>ANALYZE</option> option, it does not affect
+     <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ab09d84..71dea2c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Updating the index statistics
+ * requires writing to the system catalog, and since such writes are not
+ * allowed during parallel mode, we update the index statistics after
+ * exiting from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * we don't need to worry about DSM keys conflicting with plan_node_id, so
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers.  Hence it is allocated in the
+ * DSM segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples count in the index vacuum case or to the new live
+	 * tuples count in the index cleanup case.
+	 *
+	 * estimated_count is true if reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In a single process lazy vacuum we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning.  In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than a single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which indexes have stats in shared memory.  A set bit in the
+	 * map indicates that the corresponding index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
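/*
 * Annotation (illustration, not part of the patch): for index number 10,
 * IndStatsIsNull tests bitmap[10 >> 3] = bitmap[1] against
 * 1 << (10 & 0x07) = 1 << 2, i.e. bit 2 of the second bitmap byte; a zero
 * bit means the index has no stats slot and does not participate in the
 * parallel vacuum.
 */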
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +306,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +319,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +670,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +690,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -553,13 +749,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples if the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +948,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +977,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +997,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1193,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1232,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1378,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1448,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1477,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1592,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1626,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1642,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1461,12 +1666,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1532,7 +1744,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1541,7 +1753,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1589,6 +1801,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1599,16 +1812,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1729,19 +1942,387 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that the parallel context is initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to consider it.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
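	/*
	 * Annotation (illustration, not part of the patch): e.g. in the
	 * bulkdelete phase with four indexes supporting parallel bulkdelete,
	 * nworkers = 4 - 1 = 3 because the leader itself also processes one
	 * index; if the parallel context was created with only two workers,
	 * the Min() above clamps nworkers to 2.
	 */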
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
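/*
 * Annotation (illustration, not part of the patch): the
 * pg_atomic_fetch_add_u32() on lvshared->idx above is what distributes the
 * work; the leader and each worker atomically claim the next index number,
 * so every index is processed by exactly one backend without any locking.
 */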
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in the current phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from them,
+	 * because they allocate it locally and it's possible that an index will
+	 * be vacuumed by a different vacuum process the next time.  The copying
+	 * of the result normally happens only after the first index vacuuming.
+	 * From the second time onwards, we pass the result via the DSM segment
+	 * so that the index AM callbacks update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that *stats points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,30 +2332,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1782,49 +2371,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2132,19 +2705,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2158,34 +2729,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
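/*
 * Annotation (illustration, not part of the patch): with vac_work_mem =
 * 65536 (kB, i.e. 64MB) and sizeof(ItemPointerData) = 6, the useindex
 * branch above allows 65536 * 1024 / 6, roughly 11 million dead-tuple
 * TIDs, which is then clamped by the table size and floored at
 * MaxHeapTuplesPerPage.
 */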
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2199,12 +2785,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2352,3 +2938,453 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * size of the relation doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember the indexes that can participate in a
+ * parallel index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing a parallel operation in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
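/*
 * Annotation (illustration, not part of the patch): suppose a table has
 * four sufficiently large indexes, three supporting parallel bulkdelete
 * and all four supporting some form of parallel cleanup.  Then
 * nindexes_parallel = Max(3, 4) = 4; the leader takes one index, leaving
 * 3; with nrequested = 0 and max_parallel_maintenance_workers = 2, the
 * result is Min(3, 2) = 2 workers.
 */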
+
+/*
+ * Initialize variables for shared index statistics, set the NULL bitmap and
+ * the size of stats for each index.  Since we currently don't support
+ * parallel vacuum for autovacuum, we don't need to care about
+ * autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns a parallel vacuum state if we can launch
+ * even one worker.  This function is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * a parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	lps->pcxt = pcxt;
+	Assert(pcxt->nworkers > 0);
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
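+
+	/*
+	 * If any of the participating indexes use maintenance_work_mem, divide
+	 * it by the number of such indexes or by the number of workers,
+	 * whichever is smaller; otherwise each worker may use the full budget.
+	 */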
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to MAXALIGN here because the size of the shared information
+	 * was estimated with MAXALIGN above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and later use them to
+ * update pg_class.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i],
+				   &(indstats->stats),
+				   sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
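+
+	/*
+	 * Step over the preceding entries; indexes whose stats slot is NULL
+	 * occupy no space in the array.
+	 */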
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped in the current pass of
+ * parallel index vacuum or parallel index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process.  It's
+	 * okay because The lockmode does not conflict among the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in OID order, which should match
+	 * the order the leader opened them in.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236..6c9ee65 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context so that we can launch
+ * a different number of workers.  This is required for cases where we need to
+ * reuse the same DSM segment, but the number of workers can vary from
+ * run to run.
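+ *
+ * A typical call sequence when reusing the DSM segment (illustrative):
+ *   ReinitializeParallelDSM(pcxt);
+ *   ReinitializeParallelWorkers(pcxt, new_nworkers);
+ *   LaunchParallelWorkers(pcxt);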
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must not exceed the number of workers
+	 * with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23..0672be2 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum was requested, but the user didn't
+				 * specify the parallel degree; it will be determined at the
+				 * start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1739,6 +1778,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	}
 
 	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is the
+	 * better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
+	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
 	 * us separately.
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2030,65 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance).
+ * We allow each worker to update it as and when it has incurred any cost and
+ * then based on that decide whether it needs to sleep.  We allow the worker
+ * to sleep proportional to the work done and reduce the
+ * VacuumSharedCostBalance by the amount which is consumed by the current
+ * worker (VacuumCostBalanceLocal).  This avoids making workers that have
+ * done little or no I/O sleep, and ensures that workers doing more I/O
+ * get throttled more.
+ *
+ * We allow any worker to sleep only if it has performed the I/O above a
+ * certain threshold, which is calculated based on the number of active
+ * workers (VacuumActiveNWorkers), and the overall cost balance is more than
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve
+ * the required throttling if we allow a worker that has done more than 50%
+ * of its share of work to sleep.
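+ *
+ * As an illustrative example (the numbers are not from the patch): with
+ * vacuum_cost_limit = 200, vacuum_cost_delay = 10 and 4 active workers, a
+ * worker sleeps only once the shared balance has reached 200 and its own
+ * local balance exceeds 0.5 * (200 / 4) = 25; it then sleeps for
+ * 10 * VacuumCostBalanceLocal / 200 msec and subtracts its local balance
+ * from the shared balance.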
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/* We have added VacuumCostBalance into the shared balance, so reset it */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e919317..641999c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* We don't support parallel vacuum for autovacuum for now */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db35..e2dbd94 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6..e89c125 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae64..b9ad6cf 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b9becdb..254a6bc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) disables
+	 * parallel vacuum, 0 chooses the degree based on the number of indexes,
+	 * and a positive value is the user-requested number of workers.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..8571133 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..be4f556 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
1.8.3.1

v39-0003-Add-FAST-option-to-vacuum-command.patch (application/octet-stream)
From c78e267328c6828473e78f87e8c17326ea059e0e Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:37:09 +0530
Subject: [PATCH 3/3] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++++
 src/backend/access/heap/vacuumlazy.c | 43 ++++++++++++++++++++++--------------
 src/backend/commands/vacuum.c        |  9 ++++++--
 src/include/commands/vacuum.h        |  3 ++-
 src/test/regress/expected/vacuum.out |  3 +++
 src/test/regress/sql/vacuum.sql      |  4 ++++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083..b190cb0 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -251,6 +252,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum with the cost-based vacuum delay feature disabled.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 71dea2c..01e51ef 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -219,6 +219,13 @@ typedef struct LVShared
 	pg_atomic_uint32 active_nworkers;
 
 	/*
+	 * True if we forcibly disable the cost-based vacuum delay during parallel
+	 * index vacuum.  This can be true when the user specified the FAST
+	 * vacuum option.
+	 */
+	bool		fast;
+
+	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
 	 * map indicates that the particular index supports a parallel vacuum.
@@ -352,7 +359,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -755,7 +762,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -1992,16 +1999,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->fast)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
 		 * The number of workers can vary between and bulkdelete and cleanup
@@ -2020,7 +2030,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->fast)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3070,7 +3080,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3090,7 +3100,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3098,7 +3108,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3177,6 +3187,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->fast = (params->options & VACOPT_FAST);
 
 	/*
 	 * We need to care about alignment because we estimate the shared memory
@@ -3367,7 +3378,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->fast));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0672be2..3019a72 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 254a6bc..faed3f9 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -183,7 +183,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 8571133..07c7b88 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -118,6 +118,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+-- test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index be4f556..6227ab9 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -99,6 +99,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+-- test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 
-- 
1.8.3.1

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From b44bc6deae88c9bec552f6de4e6e73a1b252c191 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, if vacuum is canceled or errors out between the first and
second stages) delay the pages being recycled.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information of internal and empty pages
between different stages of vacuum.  It will be quite tricky to share this
information via DSM, which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e401..730f3e8 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

#311Mahendra Singh
mahi6run@gmail.com
In reply to: Amit Kapila (#310)

On Mon, 23 Dec 2019 at 16:11, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Fri, Dec 20, 2019 at 12:13 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the updated version patch that incorporates all the
review comments I got so far.

I have further edited the first two patches posted by you. The
changes include (a) changing tests to reset the GUC, (b) removing some
stuff that is not required in this version, (c) moving some variables
around into a better order, (d) changing comments and a few other
cosmetic things, and (e) adding commit messages for the first two patches.

I think the first two patches attached in this email are in good shape,
and we can commit them unless you or someone else has more comments on
them; the main parallel vacuum patch can still be improved by some more
testing, polish, and review. I am planning to push the first two patches
next week after another pass. The first two patches are explained
briefly below:

1. v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM: It
allows us to delete empty pages in each pass during GIST VACUUM.
Earlier, we used to postpone deleting empty pages till the second stage
of vacuum to amortize the cost of scanning internal pages. However,
that can sometimes (say, if vacuum is canceled or errors out between the
first and second stages) delay the pages being recycled. Another thing is
that to facilitate deleting empty pages in the second stage, we need
to share the information of internal and empty pages between different
stages of vacuum. It will be quite tricky to share this information
via DSM, which is required for the main parallel vacuum patch. Also,
it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass. Overall, the advantages of
deleting empty pages in each pass outweigh the advantages of
postponing the same. This patch is discussed in detail in a separate
thread [1]; an illustrative call-flow sketch follows after the links below.

2. v39-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch:
Introduce new fields amusemaintenanceworkmem and
amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The
amusemaintenanceworkmem field tells whether a particular IndexAM uses
maintenance_work_mem or not. This will help in controlling the memory
used by individual workers, as otherwise each worker could consume
memory equal to maintenance_work_mem. This has been discussed in
detail in a separate thread as well [2]. The amparallelvacuumoptions
field tells whether a particular IndexAM participates in a parallel
vacuum and, if so, in which phases (bulkdelete, vacuumcleanup) of
vacuum; a handler sketch follows after the links below.

[1] -

/messages/by-id/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com

[2] -

/messages/by-id/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
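
For item 1, the per-pass flow implemented by the GIST patch is roughly
the following (function names are from the patch; this is a sketch, not
actual code):

    gistbulkdelete()
      -> gistvacuumscan()
           /* scan all pages, remembering internal and empty leaf
            * pages in integer sets kept in GistVacState */
           -> gistvacuum_delete_empty_pages()
                /* unlink empty leaves from their parents in the
                 * same pass, instead of waiting for the cleanup
                 * stage */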
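
For item 2, a minimal sketch of how an index AM's handler might
advertise the new fields; the handler name and the chosen option values
are assumptions for a btree-like AM, not code from the patch:

    Datum
    bthandler(PG_FUNCTION_ARGS)
    {
        IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

        /* ... other amroutine fields ... */

        /* this AM does not itself consume maintenance_work_mem */
        amroutine->amusemaintenanceworkmem = false;

        /*
         * Do bulk-deletion in parallel; do cleanup in parallel only
         * when bulk-deletion has not already been performed.
         */
        amroutine->amparallelvacuumoptions =
            VACUUM_OPTION_PARALLEL_BULKDEL |
            VACUUM_OPTION_PARALLEL_COND_CLEANUP;

        PG_RETURN_POINTER(amroutine);
    }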

Hi,
I reviewed the v39 patch set. Below are some minor review comments:

1.
+ * memory equal to maitenance_work_mem, the new maitenance_work_mem for

maitenance_work_mem should be replaced by maintenance_work_mem.

2.
+ * The number of workers can vary between and bulkdelete and cleanup

I think the above sentence is grammatically incorrect; the first "and" is
extra.

3.
+ /*
+ * Open table.  The lock mode is the same as the leader process.  It's
+ * okay because The lockmode does not conflict among the parallel workers.
+ */

I think, "lock mode" and "lockmode", both should be same.(means extra space
should be removed from "lock mode"). In "The", "T" should be small case
letter.

4.
+ /* We don't support parallel vacuum for autovacuum for now */

I think the above sentence should read "As of now, we don't support
parallel vacuum for autovacuum".

5. I am not sure that I am right, but I can see that we are not consistent
in how we end single-line comments.

I think if a single-line comment starts with an upper-case letter, then
we should not put a period (dot) at the end of the comment, but if the
comment starts with a lower-case letter, then we should put a period (dot)
at the end.

a)
+ /* parallel vacuum must be active */
I think we should end the above comment with a dot, or we should make the
"p" of "parallel" an upper-case letter.

b)
+ /* At least count itself */
I think the above is correct.

If my understanding is correct, then please let me know so that I can make
these changes on top of the v39 patch set.

6.
+ bool amusemaintenanceworkmem;

I think we haven't run pgindent.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

#312Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh (#311)

On Mon, Dec 23, 2019 at 11:02 PM Mahendra Singh <mahi6run@gmail.com> wrote:

5. I am not sure that I am right, but I can see that we are not consistent in how we end single-line comments.

I think if a single-line comment starts with an upper-case letter, then we should not put a period (dot) at the end of the comment, but if the comment starts with a lower-case letter, then we should put a period (dot) at the end.

a)
+ /* parallel vacuum must be active */
I think we should end the above comment with a dot, or we should make the "p" of "parallel" an upper-case letter.

b)
+ /* At least count itself */
I think the above is correct.

I have checked a few files in this context and I don't see any
consistency, so I would suggest keeping things consistent with the
nearby code. Do you have any reason for the above conclusion?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#313Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#310)

On Mon, 23 Dec 2019 at 19:41, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 20, 2019 at 12:13 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

I've attached the updated version patch that incorporated all the
review comments I got so far.

I have further edited the first two patches posted by you. The
changes include (a) changed tests to reset the guc, (b) removing some
stuff which is not required in this version, (c) moving some variables
around to make them in better order, (d) changed comments and few
other cosmetic things and (e) commit messages for first two patches.

I think the first two patches attached in this email are in good shape,
and we can commit those unless you or someone else has more comments on
them; the main parallel vacuum patch can still be improved by some
more testing/polishing/review. I am planning to push the first two patches
next week after another pass. The first two patches are explained in
brief as below:

1. v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM: It
allows us to delete empty pages in each pass during GIST VACUUM.
Earlier, we used to postpone deleting empty pages till the second stage
of vacuum to amortize the cost of scanning internal pages. However,
that can sometimes (say, if vacuum is canceled or errors out between the
first and second stages) delay the recycling of those pages. Another thing is
that to facilitate deleting empty pages in the second stage, we need
to share the information of internal and empty pages between different
stages of vacuum. It will be quite tricky to share this information
via DSM which is required for the main parallel vacuum patch. Also,
it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass. Overall, the advantages of
deleting empty pages in each pass outweigh the advantages of
postponing the same. This patch is discussed in detail in a separate
thread [1].

2. v39-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch:
Introduce new fields amusemaintenanceworkmem and
amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The
amusemaintenanceworkmem tells whether a particular IndexAM uses
maintenance_work_mem or not. This will help in controlling the memory
used by individual workers as otherwise, each worker can consume
memory equal to maintenance_work_mem. This has been discussed in
detail in a separate thread as well [2]. The amparallelvacuumoptions
tells whether a particular IndexAM participates in a parallel vacuum
and, if so, in which phase (bulkdelete, vacuumcleanup) of vacuum.

Thank you for updating the patches!

The first patches look good to me. I'm reviewing the other patches and
will post comments if there are any.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#314Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#313)

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing the other patches and
will post comments if there are any.

Okay, feel free to address the few comments raised by Mahendra along with
whatever you find.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#315Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#314)

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing the other patches and
will post comments if there are any.

Oops, I meant the first "two" patches look good to me.

Okay, feel free to address the few comments raised by Mahendra along with
whatever you find.

Thanks!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#316Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Masahiko Sawada (#315)
4 attachment(s)

On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing the other patches and
will post comments if there are any.

Oops, I meant the first "two" patches look good to me.

Okay, feel free to address the few comments raised by Mahendra along with
whatever you find.

Thanks!

I've attached an updated patch set, as the previous version conflicts
with the current HEAD. This patch set incorporates the review
comments, a few fixes, and the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. The 0001 patch is the same
as the previous version.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
From dae4ab88cb390d90ea969a5143df68432790deb5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 25 Dec 2019 15:32:23 +0900
Subject: [PATCH v40 4/4] Add ability to disable leader participation in
 parallel vacuum

---
 src/backend/access/heap/vacuumlazy.c | 41 ++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ff0acad1ec..aef947f3af 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -138,6 +138,13 @@
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
 
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
 /*
  * Macro to check if we are in a parallel lazy vacuum.  If true, we are
  * in the parallel mode and the DSM segment is initialized.
@@ -270,6 +277,12 @@ typedef struct LVParallelState
 	int			nindexes_parallel_bulkdel;
 	int			nindexes_parallel_cleanup;
 	int			nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool		leaderparticipates;
 } LVParallelState;
 
 typedef struct LVRelStats
@@ -1971,13 +1984,17 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	{
 		if (lps->lvshared->first_time)
 			nworkers = lps->nindexes_parallel_cleanup +
-				lps->nindexes_parallel_condcleanup - 1;
+				lps->nindexes_parallel_condcleanup;
 		else
-			nworkers = lps->nindexes_parallel_cleanup - 1;
+			nworkers = lps->nindexes_parallel_cleanup;
 
 	}
 	else
-		nworkers = lps->nindexes_parallel_bulkdel - 1;
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process takes one index */
+	if (lps->leaderparticipates)
+		nworkers--;
 
 	/*
 	 * It is possible that parallel context is initialized with fewer workers
@@ -2061,8 +2078,9 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	 * Join as a parallel worker.  The leader process alone processes all the
 	 * indexes in the case where no workers are launched.
 	 */
-	parallel_vacuum_index(Irel, stats, lps->lvshared,
-						  vacrelstats->dead_tuples, nindexes);
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		parallel_vacuum_index(Irel, stats, lps->lvshared,
+							  vacrelstats->dead_tuples, nindexes);
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
@@ -2964,6 +2982,7 @@ static int
 compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 								bool *can_parallel_vacuum)
 {
+	bool		leaderparticipates = true;
 	int			nindexes_parallel = 0;
 	int			nindexes_parallel_bulkdel = 0;
 	int			nindexes_parallel_cleanup = 0;
@@ -3005,8 +3024,13 @@ compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 	if (nindexes_parallel == 0)
 		return 0;
 
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
 	/* The leader process takes one index */
-	nindexes_parallel--;
+	if (leaderparticipates)
+		nindexes_parallel--;
 
 	/* Compute the parallel degree */
 	parallel_workers = (nrequested > 0) ?
@@ -3125,6 +3149,11 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 								 parallel_workers);
 	Assert(pcxt->nworkers > 0);
 	lps->pcxt = pcxt;
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
 
 	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
 	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
-- 
2.23.0
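
As an aside on the v40-0004 patch above: with leader participation
enabled (the default), the leader vacuums one index itself, so one fewer
worker is requested; defining PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
(the commented-out #undef in the patch marks the spot) makes the leader
only coordinate, which can help reproduce worker-only bugs.  A condensed
sketch of that worker-count adjustment, where "request_nworkers" is a
hypothetical name used only for illustration:

#include "postgres.h"

/*
 * Sketch of the worker-count adjustment in v40-0004; not a drop-in
 * function.  nindexes_parallel is the number of indexes eligible for
 * the current phase (bulkdelete or cleanup).
 */
static int
request_nworkers(int nindexes_parallel, bool leaderparticipates)
{
	int			nworkers = nindexes_parallel;

	/* The leader process takes one index for itself */
	if (leaderparticipates)
		nworkers--;

	return nworkers;
}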

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
From 9cec20e440d8e6e5082198e84e942f9131e45ee3 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 14:46:37 +0530
Subject: [PATCH v40 1/4] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and, if so, in which phase (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     |  4 ++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/indexam.h                  |  4 ++
 src/include/commands/vacuum.h                 | 38 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 44b7e74c5c..0ddb8d0160 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..abd8c40e7e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..64bd81a003 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..e29a43bddf 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..8b9272c05f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 065b5290b0..313e31c71b 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d01ea59e14..f5b8fc4ed5 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/indexam.h b/src/include/access/indexam.h
index 9b2eefb531..0215316c3d 100644
--- a/src/include/access/indexam.h
+++ b/src/include/access/indexam.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b9becdbe99 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAM's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase of index irrespective of whether the index is
+ * already scanned or not during bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index bade886866..db5cf80815 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/indexam.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0
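
As a usage illustration for v40-0001 above: a caller can test the new
fields off the relation's cached IndexAmRoutine.  This is a sketch
assuming the flag names from the patch; "index_supports_parallel_bulkdel"
is a hypothetical helper, while the main parallel vacuum patch does the
equivalent test inline when computing which indexes can be vacuumed in
parallel:

#include "postgres.h"
#include "access/indexam.h"
#include "commands/vacuum.h"
#include "utils/rel.h"

/* Sketch: does this index AM allow parallel bulk-deletion? */
static bool
index_supports_parallel_bulkdel(Relation indrel)
{
	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;

	/* No bit outside the defined flag set should be set. */
	Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);

	return (vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;
}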

v40-0003-Add-FAST-option-to-vacuum-command.patch
From cf0854b52cc51b65de15a57bb43e0f5755e40de7 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:37:09 +0530
Subject: [PATCH v40 3/4] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 43 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6324ed746c..ff0acad1ec 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable cost-based vacuum delay during parallel
+	 * index vacuum.  This can be true when the user specified the FAST vacuum
+	 * option.
+	 */
+	bool		fast;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -352,7 +359,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -755,7 +762,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -1992,16 +1999,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->fast)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
 		 * The number of workers can vary between bulkdelete and cleanup
@@ -2020,7 +2030,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->fast)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3070,7 +3080,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3090,7 +3100,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3098,7 +3108,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3176,6 +3186,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->fast = (params->options & VACOPT_FAST);
 
 	/*
 	 * We need to care about alignment because we estimate the shared memory
@@ -3365,7 +3376,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->fast));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9431a95e4a..ebc1f4c9f6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 254a6bcda6..faed3f9718 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -183,7 +183,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 8571133fe7..07c7b88a16 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -118,6 +118,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index be4f55616e..6227ab9423 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -99,6 +99,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 
-- 
2.23.0
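
To summarize the effect of the FAST option in v40-0003 above: both the
leader (in vacuum()) and each parallel worker (in parallel_vacuum_main())
gate cost-based delay on VACOPT_FAST, so VACUUM (FAST) behaves like
running with vacuum_cost_delay set to zero.  A condensed sketch of that
gate, where "setup_cost_based_delay" is a hypothetical wrapper name:

#include "postgres.h"
#include "commands/vacuum.h"
#include "miscadmin.h"

/* Sketch of the cost-delay gate added by v40-0003. */
static void
setup_cost_based_delay(VacuumParams *params)
{
	/* Delay only if configured, and never when FAST was specified. */
	VacuumCostActive = (VacuumCostDelay > 0) &&
		!(params->options & VACOPT_FAST);
	VacuumCostBalance = 0;
}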

v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
From 2c65a5814abfa5fedf6806c793eda0c134cf162c Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:03:14 +0530
Subject: [PATCH v40 2/4] Add a parallel option to the VACUUM command.

This change adds a PARALLEL option to VACUUM command that enables us to
perform index vacuuming and index cleanup with background workers.  Each
index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.  Users can
specify a parallel degree along with this option, which indicates the
number of workers to use; that number is limited by the number of
indexes on the table.  This option can't be used with the FULL option.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Mahendra Singh and
Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1238 +++++++++++++++++++++++--
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  132 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 12 files changed, 1414 insertions(+), 118 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on
+      the relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 253b273366..6324ed746c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed, the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  For updating the index statistics,
+ * we need to update the system table and, since updates are not
+ * allowed during parallel mode, we update the index statistics after exiting
+ * from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -40,21 +54,26 @@
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/indexgenam.h"
+#include "access/indexam.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,12 +306,11 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -169,12 +319,44 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVRelStats *vacrelstats, LVParallelState *lps,
+								int nindexes);
+static void lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+								 LVRelStats *vacrelstats, LVParallelState *lps,
+								 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -488,6 +670,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -496,6 +690,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -553,13 +749,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -737,8 +948,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			const int	hvp_index[] = {
 				PROGRESS_VACUUM_PHASE,
@@ -766,10 +977,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
-			for (i = 0; i < nindexes; i++)
-				lazy_vacuum_index(Irel[i],
-								  &indstats[i],
-								  vacrelstats);
+			lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 			/*
 			 * Report that we are now vacuuming the heap.  We also increase
@@ -789,7 +997,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 			vacrelstats->num_index_scans++;
 
 			/*
@@ -985,7 +1193,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1024,7 +1232,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1170,7 +1378,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1240,7 +1448,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1269,7 +1477,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1384,7 +1592,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1418,7 +1626,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		const int	hvp_index[] = {
 			PROGRESS_VACUUM_PHASE,
@@ -1434,10 +1642,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
-		for (i = 0; i < nindexes; i++)
-			lazy_vacuum_index(Irel[i],
-							  &indstats[i],
-							  vacrelstats);
+		lazy_vacuum_indexes(Irel, indstats, vacrelstats, lps, nindexes);
 
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
@@ -1461,12 +1666,19 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics, as we cannot write
+	 * while in parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1532,7 +1744,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1541,7 +1753,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1589,6 +1801,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1599,16 +1812,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1729,19 +1942,387 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that parallel context is initialized with fewer workers
+	 * than the number of indexes that need a separate worker in the current
+	 * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend as we have
+			 * already accumulated the remaining balance of the heap scan.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation at this phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we receive it from
+	 * them, because they allocate it locally and the same index might be
+	 * vacuumed by a different vacuum process next time.  The copying normally
+	 * happens only after the first index vacuuming.  From the second time on,
+	 * we pass the result stored in the DSM segment so that it is updated
+	 * directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ * Vacuum indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVRelStats *vacrelstats, LVParallelState *lps,
+					int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
+}
+
+/*
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+					 LVRelStats *vacrelstats, LVParallelState *lps,
+					 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1751,30 +2332,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1782,49 +2371,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2132,19 +2705,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
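+ *
+ * As an example (numbers purely illustrative): with vac_work_mem = 64MB and
+ * a table that has indexes, this comes out to 64 * 1024 * 1024 /
+ * sizeof(ItemPointerData), i.e. roughly 11 million dead tuple TIDs.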
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2158,34 +2729,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2199,12 +2785,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2352,3 +2938,451 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that is
+ * the number of indexes that support parallel index vacuuming.  This function
+ * also sets can_parallel_vacuum to remember indexes that participate in
+ * parallel index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* Return if no index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * and the size of stats for each index.  Since we don't currently support
+ * parallel vacuum for autovacuum, we needn't care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns parallel vacuum state if we can launch
+ * even one worker.  It is responsible for creating a parallel context,
+ * entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must have been requested, and there must be indexes
+	 * on the relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * The offset must be MAXALIGNed, because the shared memory size was
+	 * estimated that way.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory, and the caller later uses
+ * them to update the index statistics.
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index must be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuum or cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		 * Skip if the index supports parallel cleanup conditionally, but we
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process's.  It's
+	 * okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in order by OID, which should
+	 * match the leader's order.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..6c9ee65ba2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context such that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must not exceed the number of workers
+	 * with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 477b271aa3..9431a95e4a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't specify
+				 * the parallel degree. The parallel degree will be determined
+				 * at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is a
+	 * better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,65 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost-related parameters (mainly VacuumCostBalance).
+ * Each worker updates the shared balance whenever it incurs any cost, and
+ * based on that decides whether it needs to sleep.  A worker sleeps in
+ * proportion to the work it has done, and the sleep reduces
+ * VacuumSharedCostBalance by the amount consumed by that worker
+ * (VacuumCostBalanceLocal).  This avoids making workers sleep that have done
+ * little or no I/O compared to other workers, and ensures that workers doing
+ * more I/O get throttled more.
+ *
+ * We allow any worker to sleep only if it has performed the I/O above a
+ * certain threshold, which is calculated based on the number of active
+ * workers (VacuumActiveNWorkers), and the overall cost balance is more than
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve
+ * the required throttling if we allow a worker that has done more than 50%
+ * of its share of work to sleep.
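+ *
+ * As a worked example (numbers chosen only for illustration): with
+ * VacuumCostLimit = 200 and two active workers, a worker becomes eligible
+ * to sleep once its VacuumCostBalanceLocal exceeds 0.5 * (200 / 2) = 50 and
+ * the shared balance has reached 200; its sleep time is then proportional
+ * to its local balance, VacuumCostDelay * VacuumCostBalanceLocal /
+ * VacuumCostLimit.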
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e919317bab..a97cfe2111 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..b9ad6cf671 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b9becdbe99..254a6bcda6 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) means no
+	 * workers; 0 means choose the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..8571133fe7 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..be4f55616e 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

#317Mahendra Singh
mahi6run@gmail.com
In reply to: Masahiko Sawada (#316)

On Wed, 25 Dec 2019 at 17:47, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing other patches and
will post comments if there is.

Oops I meant first "two" patches look good to me.

Okay, feel free to address few comments raised by Mahendra along with
whatever you find.

Thanks!

I've attached updated patch set as the previous version patch set
conflicts to the current HEAD. This patch set incorporated the review
comments, a few fix and the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. 0001 patch is the same
as previous version.

I verified all my review comments in the v40 patch set. All are fixed.

v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch doesn't
apply on HEAD. Please send a rebased patch.

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#318Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Masahiko Sawada (#316)

Hi,

On Wed, Dec 25, 2019 at 09:17:16PM +0900, Masahiko Sawada wrote:

On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing other patches and
will post comments if there is.

Oops I meant first "two" patches look good to me.

Okay, feel free to address few comments raised by Mahendra along with
whatever you find.

Thanks!

I've attached updated patch set as the previous version patch set
conflicts to the current HEAD. This patch set incorporated the review
comments, a few fix and the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. 0001 patch is the same
as previous version.

I've been reviewing the updated patches over the past couple of days, so
let me share some initial review comments. I initially started to read
the thread, but then I realized it's futile - the thread is massive, and
the patch has changed so much that re-reading the whole thread is a waste of time.

It might be useful to write a summary of the current design, but AFAICS the
original plan to parallelize the heap scan is abandoned and we now do
just the steps that vacuum indexes in parallel. Which is fine, but it
means the subject "block level parallel vacuum" is a bit misleading.

Anyway, most of the logic is implemented in part 0002, which actually
does all the parallel worker stuff. The remaining parts 0001, 0003 and
0004 are either preparing infrastructure or not directly related to the
primary feature.

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
-----------------------------------------------------------

I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
it should be called just 'amvacuumoptions' or something like that? The
'parallel' part is actually encoded in names of the options.

Also, why do we need a separate amusemaintenanceworkmem option? Why
don't we simply track it using a separate flag in 'amvacuumoptions'
(or whatever we end up calling it)?
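
Something like this, perhaps (just a sketch - the extra flag is entirely
made up, it's not something the patch defines, and I'm using the renamed
field):

    /* hypothetical additional bit in the same options bitmask */
    #define VACUUM_OPTION_USE_MAINT_WORKMEM		(1 << 3)

    /* an AM would then declare everything in one place */
    amroutine->amvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL |
                                 VACUUM_OPTION_PARALLEL_COND_CLEANUP |
                                 VACUUM_OPTION_USE_MAINT_WORKMEM;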

Would it make sense to track m_w_m usage separately for the two index
cleanup phases? Or is that unnecessary / pointless?

v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
----------------------------------------------------------

I haven't found any issues yet, but I've only started with the code
review. I'll continue with the review. It seems to be in fairly good shape
though, I think - I only have two minor comments at the moment:

- The SizeOfLVDeadTuples macro seems rather weird. It does include space
for one ItemPointerData, but we really need an array of them. But then
all the places where the macro is used explicitly add space for the
pointers, so the sizeof(ItemPointerData) seems unnecessary. So it
should be either

#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))

or

#define SizeOfLVDeadTuples(cnt) \
	(offsetof(LVDeadTuples, itemptrs) + (cnt) * sizeof(ItemPointerData))

in which case the callers can be simplified.
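
For instance (a sketch, not the exact patch code), with the second form a
caller that currently spells out the array size, e.g.

   maxtuples = compute_max_dead_tuples(nblocks, vacrelstats->useindex);
   est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
                      mul_size(sizeof(ItemPointerData), maxtuples)));

would collapse to

   est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));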

- It's not quite clear to me why we need the new nworkers_to_launch
field in ParallelContext.

v40-0003-Add-FAST-option-to-vacuum-command.patch
------------------------------------------------

I do have a bit of an issue with this part - I'm not quite convinced we
actually need a FAST option, and I actually suspect we'll come to regret
it sooner rather than later. AFAIK it pretty much does exactly the same thing
as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
multiple ways to do the same thing - I do expect reports from confused
users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
sufficient solution?

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC? It's good enough for CREATE INDEX
so why not here?

Maybe it's explained somewhere deep in the thread, of course ...

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#319Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Tomas Vondra (#318)
4 attachment(s)

On Fri, 27 Dec 2019 at 11:24, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

On Wed, Dec 25, 2019 at 09:17:16PM +0900, Masahiko Sawada wrote:

On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing the other patches and
will post comments if there are any.

Oops, I meant the first "two" patches look good to me.

Okay, feel free to address the few comments raised by Mahendra along
with whatever you find.

Thanks!

I've attached an updated patch set as the previous version patch set
conflicts with the current HEAD. This patch set incorporates the review
comments, a few fixes, and the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. The 0001 patch is the
same as in the previous version.

I've been reviewing the updated patches over the past couple of days, so
let me share some initial review comments. I initially started to read
the thread, but then I realized it's futile - the thread is massive, and
the patch has changed so much that re-reading the whole thread is a
waste of time.

Thank you for reviewing this patch!

It might be useful to write a summary of the current design, but AFAICS
the original plan to parallelize the heap scan has been abandoned and we
now just run the index vacuuming and index cleanup steps in parallel.
Which is fine, but it means the subject "block level parallel vacuum" is
a bit misleading.

Yeah, I should have renamed it. I'll summarize the current design.

Anyway, most of the logic is implemented in part 0002, which actually
does all the parallel worker stuff. The remaining parts 0001, 0003 and
0004 are either preparing infrastructure or not directly related to the
primary feature.

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
-----------------------------------------------------------

I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
it should be called just 'amvacuumoptions' or something like that? The
'parallel' part is actually encoded in the names of the options.

amvacuumoptions seems good to me.

Also, why do we need a separate amusemaintenanceworkmem option? Why
don't we simply track it using a separate flag in 'amvacuumoptions'
(or whatever we end up calling it)?

It also seems like a good idea.

Would it make sense to track m_w_m usage separately for the two index
cleanup phases? Or is that unnecessary / pointless?

We could do that, but currently the only index AM that uses this option
is gin. And gin indexes can use maintenance_work_mem during both
bulkdelete and cleanup. So it might be unnecessary, at least as of now.
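
(For reference, in the 0002 patch this is a single per-worker budget,
maintenance_work_mem_worker. A sketch of the idea - not necessarily the
exact v41 code - where nindexes_mwm stands for the number of indexes
that use maintenance_work_mem:

   /* cap each worker so parallel vacuum stays within one m_w_m budget */
   lvshared->maintenance_work_mem_worker =
       (nindexes_mwm > 0) ?
       maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
       maintenance_work_mem;

Tracking the two cleanup phases separately would mean maintaining two
such values.)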

v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
----------------------------------------------------------

I haven't found any issues yet, but I've only started with the code
review; I'll continue with it. The patch seems to be in fairly good
shape though - I only have two minor comments at the moment:

- The SizeOfLVDeadTuples macro seems rather weird. It does include space
for one ItemPointerData, but we really need an array of them. But then
all the places where the macro is used explicitly add space for the
pointers, so the sizeof(ItemPointerData) seems unnecessary. So it
should be either

#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))

or

#define SizeOfLVDeadTuples(cnt) \
(offsetof(LVDeadTuples, itemptrs) + (cnt) * sizeof(ItemPointerData))

in which case the callers can be simplified.

Fixed it using the former.

- It's not quite clear to me why we need the new nworkers_to_launch
field in ParallelContext.

The motivation for nworkers_to_launch is to specify the number of
workers to actually launch when we use the same parallel context
several times while changing the number of workers to launch. Since an
index AM can choose to participate in bulkdelete and/or cleanup, the
number of workers required for each vacuum phase can be different. I
originally changed LaunchParallelWorkers to take the number of workers
to launch so that it launches a different number of workers for each
vacuum phase, but Robert suggested changing the routine for
reinitializing the parallel context instead[1]. That would be less
confusing and would involve modifying code in a lot fewer places. So
with this patch we specify the number of workers when initializing the
parallel context as the maximum number of workers, and then use
ReinitializeParallelWorkers before doing either bulkdelete or cleanup
to specify the number of workers to launch.
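
So the leader's per-phase flow looks roughly like this (simplified from
the 0002 patch):

   /* pcxt was created once with the maximum number of workers */
   if (vacrelstats->num_index_scans > 0)
       ReinitializeParallelDSM(lps->pcxt);     /* reuse the same DSM */
   ReinitializeParallelWorkers(lps->pcxt, nworkers);  /* per-phase count */
   LaunchParallelWorkers(lps->pcxt);
   ...
   WaitForParallelWorkersToFinish(lps->pcxt);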

v40-0003-Add-FAST-option-to-vacuum-command.patch
------------------------------------------------

I do have a bit of an issue with this part - I'm not quite convinced we
actually need a FAST option, and I actually suspect we'll come to regret
it sooner rather than later. AFAIK it pretty much does exactly the same thing
as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
multiple ways to do the same thing - I do expect reports from confused
users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
sufficient solution?

I think the motivation for this option is similar to FREEZE. It's
sometimes a good idea to have a shortcut for a popular usage and give
it a name corresponding to its job. From that perspective I think
having a FAST option would make sense, but maybe we need more
discussion of the combination of parallel vacuum and vacuum delay.
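
(To be concrete about the equivalence you mention, with the patch

   VACUUM (FAST) tbl;

would behave roughly the same as

   SET vacuum_cost_delay = 0;
   VACUUM tbl;

with FAST just being the shortcut spelling; tbl is of course a
placeholder.)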

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC? It's good enough for CREATE INDEX
so why not here?

AFAIR there has been no such discussion so far, but I think one reason
could be that parallel vacuum should be disabled by default. If
parallel vacuum used max_parallel_maintenance_workers (2 by default)
rather than having its own option, parallel vacuum would run with the
default setting, and I think that would have a big impact on users
because the disk access could become random reads and writes when some
indexes are on the same tablespace.
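
(As a usage sketch with the 0002 patch:

   VACUUM (PARALLEL 2) tbl;  -- request up to 2 workers for index phases
   VACUUM (PARALLEL) tbl;    -- let VACUUM choose the degree, capped by
                             -- max_parallel_maintenance_workers

whereas driving this off the GUC alone would make every plain VACUUM
parallel by default.)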

Maybe it's explained somewhere deep in the thread, of course ...

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

We discussed that it's still unclear whether we really want to commit
this code, and therefore it's separated from the main part. Please see
more details here[2].

I've fixed the code based on the review comments and rebased it onto
the current HEAD. Some comments around the vacuum option name and the
FAST option are still open, as we need more discussion.

Regards,

[1]: /messages/by-id/CA+TgmobjtHdLfQhmzqBNt7VEsz+5w3P0yy0-EsoT05yAJViBSQ@mail.gmail.com
[2]: /messages/by-id/CAA4eK1+C8OBhm4g3Mnfx+VjGfZ4ckLOLSU9i7Smo1sp4k0V5HA@mail.gmail.com

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v41-0004-Add-ability-to-disable-leader-participation-in-p.patch
From 4f6910f3d9105be32c7bcfe0013bf71059036eab Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 25 Dec 2019 15:32:23 +0900
Subject: [PATCH v41 4/4] Add ability to disable leader participation in
 parallel vacuum

---
 src/backend/access/heap/vacuumlazy.c | 41 ++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 74637c3a0e..a1ddf036b7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -138,6 +138,13 @@
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
 
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
 /*
  * Macro to check if we are in a parallel lazy vacuum.  If true, we are
  * in the parallel mode and the DSM segment is initialized.
@@ -270,6 +277,12 @@ typedef struct LVParallelState
 	int			nindexes_parallel_bulkdel;
 	int			nindexes_parallel_cleanup;
 	int			nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool		leaderparticipates;
 } LVParallelState;
 
 typedef struct LVRelStats
@@ -1985,13 +1998,17 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	{
 		if (lps->lvshared->first_time)
 			nworkers = lps->nindexes_parallel_cleanup +
-				lps->nindexes_parallel_condcleanup - 1;
+				lps->nindexes_parallel_condcleanup;
 		else
-			nworkers = lps->nindexes_parallel_cleanup - 1;
+			nworkers = lps->nindexes_parallel_cleanup;
 
 	}
 	else
-		nworkers = lps->nindexes_parallel_bulkdel - 1;
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process takes one index */
+	if (lps->leaderparticipates)
+		nworkers--;
 
 	/*
 	 * It is possible that parallel context is initialized with fewer workers
@@ -2075,8 +2092,9 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	 * Join as a parallel worker.  The leader process alone processes all the
 	 * indexes in the case where no workers are launched.
 	 */
-	parallel_vacuum_index(Irel, stats, lps->lvshared,
-						  vacrelstats->dead_tuples, nindexes);
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		parallel_vacuum_index(Irel, stats, lps->lvshared,
+							  vacrelstats->dead_tuples, nindexes);
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
@@ -2946,6 +2964,7 @@ static int
 compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 								bool *can_parallel_vacuum)
 {
+	bool		leaderparticipates = true;
 	int			nindexes_parallel = 0;
 	int			nindexes_parallel_bulkdel = 0;
 	int			nindexes_parallel_cleanup = 0;
@@ -2987,8 +3006,13 @@ compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 	if (nindexes_parallel == 0)
 		return 0;
 
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
 	/* The leader process takes one index */
-	nindexes_parallel--;
+	if (leaderparticipates)
+		nindexes_parallel--;
 
 	/* Compute the parallel degree */
 	parallel_workers = (nrequested > 0) ?
@@ -3107,6 +3131,11 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 								 parallel_workers);
 	Assert(pcxt->nworkers > 0);
 	lps->pcxt = pcxt;
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
 
 	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
 	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
-- 
2.23.0

v41-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
From d2e084ec5fb7f583cde325a3bf424088d10aa9ee Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 14:46:37 +0530
Subject: [PATCH v41 1/4] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     |  4 ++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/amapi.h                    |  4 ++
 src/include/commands/vacuum.h                 | 38 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..1874aeeb44 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..abd8c40e7e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..64bd81a003 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..e29a43bddf 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..8b9272c05f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 065b5290b0..313e31c71b 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..cbec182347 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..556affb291 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b9becdbe99 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAM's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase irrespective of whether the index was
+ * already scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..6bfd883fd3 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0

v41-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
From e6e85dfc1a98ada5a1c1251a6e271e75f7476a63 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:03:14 +0530
Subject: [PATCH v41 2/4] Add a parallel option to the VACUUM command.

This change adds a PARALLEL option to the VACUUM command that enables us
to perform index vacuuming and index cleanup with background workers.
Each index is processed by at most one vacuum process.  Therefore
parallel vacuum can be used when the table has at least two indexes.
Users can specify a parallel degree along with this option, which
indicates the number of workers to use; that number is limited by the
number of indexes on the table.  This option can't be used with the FULL
option.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Mahendra Singh and
Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1252 ++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  132 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 12 files changed, 1417 insertions(+), 129 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel
+      vacuum operation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 597d8b5f92..7231fa2923 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  For updating the index statistics,
+ * we need to update the system table, and since updates are not
+ * allowed during parallel mode, we update the index statistics after exiting
+ * from parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +949,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +965,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +976,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1171,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1210,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1356,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1426,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1455,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1570,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1604,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1621,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1681,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1699,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1758,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1767,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1815,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1826,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1956,355 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that parallel context is initialized with fewer workers
+	 * than the number of indexes that need a separate worker in the current
+	 * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation at this phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either in the leader process or in one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from them,
+	 * because they allocate it locally and a different vacuum process might
+	 * process this index the next time around.  This copying therefore
+	 * normally happens only once, after the first index vacuuming.  From then
+	 * on, we pass the result stored in the DSM segment so that ambulkdelete
+	 * and amvacuumcleanup update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * Clean up all indexes.  We process the indexes serially unless we are
+ * doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2314,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples, and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2353,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2687,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2711,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2767,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2920,451 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember which indexes participate in parallel
+ * index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* Return if no index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize the shared index statistics: set the NULL bitmap and the size
+ * of stats for each index.  Since we currently don't support parallel vacuum
+ * for autovacuum, we don't need to care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * a parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to keep this offset MAXALIGNed because that is how we
+	 * estimated the shared memory size above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM segment into local memory and later use it
+ * to update the index statistics.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped because it doesn't
+ * participate in parallel index vacuum or parallel index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process's.  It's
+	 * okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted in order by OID, which should
+	 * match the leader's order.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..6c9ee65ba2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..0672be27f1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations. However, rather than
+	 * skipping vacuum on the table, just disabling the parallel option is the
+	 * better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,65 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance).
+ * We allow each worker to update it as and when it has incurred any cost and
+ * then based on that decide whether it needs to sleep.  We allow the worker
+ * to sleep proportional to the work done and reduce the
+ * VacuumSharedCostBalance by the amount which is consumed by the current
+ * worker (VacuumCostBalanceLocal).  This can avoid letting the workers sleep
+ * who have done less or no I/O as compared to other workers and therefore can
+ * ensure that workers who are doing more I/O got throttled more.
+ *
+ * We allow any worker to sleep only if it has performed the I/O above a
+ * certain threshold, which is calculated based on the number of active
+ * workers (VacuumActiveNWorkers), and the overall cost balance is more than
+ * VacuumCostLimit set by the system.  The testing reveals that we achieve
+ * the required throttling if we allow a worker that has done more than 50%
+ * of its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e919317bab..a97cfe2111 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..b9ad6cf671 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b9becdbe99..254a6bcda6 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  -1 (the default) disables
+	 * parallel vacuum; 0 chooses the degree based on the number of indexes.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..8571133fe7 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..be4f55616e 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

v41-0003-Add-FAST-option-to-vacuum-command.patch
From c38ab16dbaf3bb8bfe56da5f2a79a6b975d40f13 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:37:09 +0530
Subject: [PATCH v41 3/4] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 43 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7231fa2923..74637c3a0e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable the cost-based vacuum delay during parallel
+	 * index vacuum. This is set when the user specified the FAST vacuum
+	 * option.
+	 */
+	bool		fast;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -353,7 +360,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -756,7 +763,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -2006,16 +2013,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->fast)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
 		 * The number of workers can vary between bulkdelete and cleanup
@@ -2034,7 +2044,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->fast)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3052,7 +3062,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3072,7 +3082,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3080,7 +3090,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3158,6 +3168,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->fast = (params->options & VACOPT_FAST);
 
 	/*
 	 * We need to keep this offset MAXALIGNed because that is how we
@@ -3347,7 +3358,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->fast));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0672be27f1..3019a72325 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 254a6bcda6..faed3f9718 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -183,7 +183,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 8571133fe7..07c7b88a16 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -118,6 +118,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index be4f55616e..6227ab9423 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -99,6 +99,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 
-- 
2.23.0

#320Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Masahiko Sawada (#319)

On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:

On Fri, 27 Dec 2019 at 11:24, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

On Wed, Dec 25, 2019 at 09:17:16PM +0900, Masahiko Sawada wrote:

On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

The first patches look good to me. I'm reviewing other patches and
will post comments if there is.

Oops I meant first "two" patches look good to me.

Okay, feel free to address few comments raised by Mahendra along with
whatever you find.

Thanks!

I've attached an updated patch set, as the previous version conflicts
with the current HEAD. This patch set incorporates the review
comments, a few fixes, and the patch for
PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. The 0001 patch is the same
as in the previous version.

I've been reviewing the updated patches over the past couple of days, so
let me share some initial review comments. I initially started to read
the thread, but then I realized it's futile - the thread is massive, and
the patch changed so much that re-reading the whole thread is a waste of time.

Thank you for reviewing this patch!

It might be useful to write a summary of the current design, but AFAICS the
original plan to parallelize the heap scan is abandoned and we now do
just the steps that vacuum indexes in parallel. Which is fine, but it
means the subject "block level parallel vacuum" is a bit misleading.

Yeah I should have renamed it. I'll summarize the current design.

OK

Anyway, most of the logic is implemented in part 0002, which actually
does all the parallel worker stuff. The remaining parts 0001, 0003 and
0004 are either preparing infrastructure or not directly related to the
primary feature.

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
-----------------------------------------------------------

I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
it should be called just 'amvacuumoptions' or something like that? The
'parallel' part is actually encoded in names of the options.

amvacuumoptions seems good to me.

Also, why do we need a separate amusemaintenanceworkmem option? Why
don't we simply track it using a separate flag in 'amvacuumoptions'
(or whatever we end up calling it)?

It also seems like a good idea.

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

Would it make sense to track m_w_m usage separately for the two index
cleanup phases? Or is that unnecessary / pointless?

We could do that, but currently the only index AM that uses this option
is gin. And gin indexes can use maintenance_work_mem during both
bulkdelete and cleanup. So it might be unnecessary, at least as of now.

OK

v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
----------------------------------------------------------

I haven't found any issues yet, but I've only started with the code
review. I'll continue with the review. It seems to be in fairly good
shape though, I think; I only have two minor comments at the moment:

- The SizeOfLVDeadTuples macro seems rather weird. It does include space
for one ItemPointerData, but we really need an array of them. But then
all the places where the macro is used explicitly add space for the
pointers, so the sizeof(ItemPointerData) seems unnecessary. So it
should be either

#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))

or

#define SizeOfLVDeadTuples(cnt) \
(offsetof(LVDeadTuples, itemptrs) + (cnt) * sizeof(ItemPointerData))

in which case the callers can be simplified.

Fixed it to the former.

Hmmm, I'd actually suggest using the latter variant, because it allows
simplifying the callers. Just translating it to offsetof() is not saving
much code, I think.
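
For illustration, a minimal sketch of the difference at a call site
(the caller code here is hypothetical, not taken from the patch):

/* offsetof-only macro: the caller adds the array size itself */
Size        len;

len = SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData);

/* parameterized variant: the macro does it in one step */
len = SizeOfLVDeadTuples(maxtuples);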

- It's not quite clear to me why we need the new nworkers_to_launch
field in ParallelContext.

The motivation of nworkers_to_launch is to specify the number of
workers to actually launch when we use the same parallel context
several times while changing the number of workers to launch. Since an
index AM can choose to participate in bulkdelete and/or cleanup, the
number of workers required for each vacuum phase can be different. I
originally changed LaunchParallelWorkers to take the number of workers
to launch, so that it could launch a different number of workers for
each vacuum phase, but Robert suggested changing the routine that
reinitializes the parallel context instead[1]. That is less confusing
and involves modifying code in a lot fewer places. So with this patch,
we specify the number of workers when initializing the parallel
context as the maximum number of workers, and by using
ReinitializeParallelWorkers before doing either bulkdelete or cleanup,
we specify the number of workers to launch.

Hmmm. I find it a bit confusing, but I don't know a better solution.
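
To make the pattern concrete, here is a rough sketch of the flow
described above, using the existing parallel-context API plus the
patch's new ReinitializeParallelWorkers routine (the entry-point name
and the worker-count variables are placeholders):

ParallelContext *pcxt;

/* one context for the whole vacuum, sized for the maximum degree */
EnterParallelMode();
pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
                             nworkers_max);
InitializeParallelDSM(pcxt);

/* bulkdelete phase: launch only as many workers as this phase needs */
ReinitializeParallelDSM(pcxt);
ReinitializeParallelWorkers(pcxt, nworkers_bulkdel);
LaunchParallelWorkers(pcxt);
WaitForParallelWorkersToFinish(pcxt);

/* cleanup phase: possibly a different number of workers */
ReinitializeParallelDSM(pcxt);
ReinitializeParallelWorkers(pcxt, nworkers_cleanup);
LaunchParallelWorkers(pcxt);
WaitForParallelWorkersToFinish(pcxt);

DestroyParallelContext(pcxt);
ExitParallelMode();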

v40-0003-Add-FAST-option-to-vacuum-command.patch
------------------------------------------------

I do have a bit of an issue with this part - I'm not quite convinced we
actually need a FAST option, and I actually suspect we'll come to regret
it sooner than later. AFAIK it pretty much does exactly the same thing
as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
multiple ways to do the same thing - I do expect reports from confused
users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
sufficient solution?

I think the motivation of this option is similar to FREEZE. I think
it's sometimes a good idea to have a shortcut for a popular usage and
give it a name corresponding to its job. From that perspective I think
having a FAST option would make sense, but maybe we need more
discussion about the combination of parallel vacuum and vacuum delay.

OK. I think it's mostly an independent piece, so maybe we should move
it to a separate patch. It's more likely to get attention/feedback
when not buried in this thread.

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC? It's good enough for CREATE INDEX
so why not here?

AFAIR there was no such discussion so far, but I think one reason
could be that parallel vacuum should be disabled by default. If
parallel vacuum used max_parallel_maintenance_workers (2 by default)
rather than having the option, it would run in parallel with the
default setting, but I think that would have a big impact on users
because the disk access could become random reads and writes when some
indexes are on the same tablespace.

I'm not quite convinced VACUUM should have parallelism disabled by
default. I know some people argued we should do that because making
vacuum faster may put pressure on other parts of the system. Which is
true, but I don't think the solution is to make vacuum slower by
default. IMHO we should do the opposite - have it parallel by default
(as driven by max_parallel_maintenance_workers), and have an option
to disable parallelism.

It's pretty much the same thing we did with vacuum throttling - it's
disabled for explicit vacuum by default, but you can enable it. If
you're worried about VACUUM causing issues, you should set cost_delay.

The way it's done now we pretty much don't handle either case without
having to tweak something:

- If you really want to go as fast as possible (e.g. during maintenance
window) you have to say "PARALLEL".

- If you need to restrict VACUUM activity, you have to set cost_delay
because just not using parallelism seems unreliable.

Of course, the question is what to do about autovacuum - I agree it may
make sense to have parallelism disabled in this case (just like we
already have throttling enabled by default for autovacuum).

Maybe it's explained somewhere deep in the thread, of course ...

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

We discussed it's still unclear whether we really want to commit this
code and therefore it's separated from the main part. Please see more
details here[2].

IMO there's not much reason for the leader not to participate. For
regular queries the leader may be doing useful stuff (essentially
running the non-parallel part of the query) but AFAIK for VACUUM
that's not the case and the leader is not doing anything.

I've fixed the code based on the review comments and rebased it onto
the current HEAD. Some comments around the vacuum option name and the
FAST option are still open, as we need more discussion.

Thanks, I'll take a look.

regards

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#321Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#320)

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
-----------------------------------------------------------

I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
it should be called just 'amvacuumoptions' or something like that? The
'parallel' part is actually encoded in names of the options.

amvacuumoptions seems good to me.

Also, why do we need a separate amusemaintenanceworkmem option? Why
don't we simply track it using a separate flag in 'amvacuumoptions'
(or whatever we end up calling it)?

It also seems like a good idea.

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].
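
For readers following along, this is roughly what the bitmask looks
like in the patch (flag and field names as in the v40 patches; they
may still change):

/* possible values for amparallelvacuumoptions */
#define VACUUM_OPTION_NO_PARALLEL           0
#define VACUUM_OPTION_PARALLEL_BULKDEL      (1 << 0)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP (1 << 1)
#define VACUUM_OPTION_PARALLEL_CLEANUP      (1 << 2)

/*
 * E.g. an AM whose bulkdelete can run in a worker and whose cleanup
 * should run in a worker only when bulkdelete was not performed:
 */
amroutine->amparallelvacuumoptions =
    VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;

/* one test instead of checking several booleans */
if ((amroutine->amparallelvacuumoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
    nindexes_bulkdel++;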

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

We discussed it's still unclear whether we really want to commit this
code and therefore it's separated from the main part. Please see more
details here[2].

IMO there's not much reason for the leader not to participate.

The only reason for this is just a debugging/testing aid because
during the development of other parallel features we required such a
knob. The other way could be to have something similar to
force_parallel_mode and there is some discussion about that as well on
this thread but we haven't concluded which is better. So, we decided
to keep it as a separate patch which we can use to test this feature
during development and decide later whether we really need to commit
it. BTW, we have found a few bugs by using this knob in the patch.

[1]: /messages/by-id/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#322Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#320)

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:

v40-0003-Add-FAST-option-to-vacuum-command.patch
------------------------------------------------

I do have a bit of an issue with this part - I'm not quite convinced we
actually need a FAST option, and I actually suspect we'll come to regret
it sooner than later. AFAIK it pretty much does exactly the same thing
as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
multiple ways to do the same thing - I do expect reports from confused
users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
sufficient solution?

I think the motivation of this option is similar to FREEZE. I think
it's sometimes a good idea to have a shortcut for a popular usage and
give it a name corresponding to its job. From that perspective I think
having a FAST option would make sense, but maybe we need more
discussion about the combination of parallel vacuum and vacuum delay.

OK. I think it's mostly an independent piece, so maybe we should move
it to a separate patch. It's more likely to get attention/feedback
when not buried in this thread.

+1. It is already a separate patch, and I think we can even discuss
it more in a new thread once the main patch is committed - or do you
think we should reach a conclusion about it right now? To me, this
option appears to be an extension of the main feature that can be
useful for some users, and people might like to have a separate
option, so we can discuss it and get broader feedback after the main
patch is committed.

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC?

How will the user specify the parallel degree? The parallel degree is
helpful because in some cases users can decide how many workers should
be launched based on the size and type of indexes.

It's good enough for CREATE INDEX
so why not here?

That is a different feature, and I think here users can make a better
judgment based on the size of indexes. Moreover, users have an option
to control the parallel degree for 'Create Index' via Alter Table
<tbl_name> Set (parallel_workers = <n>), which I am not sure is a good
idea for parallel vacuum, as the parallelism is derived more from the
size and type of indexes. Now, we can think of a similar parameter at
the table/index level for parallel vacuum, but I don't see it being
equally useful in this case.

AFAIR there was no such discussion so far, but I think one reason
could be that parallel vacuum should be disabled by default. If
parallel vacuum used max_parallel_maintenance_workers (2 by default)
rather than having the option, it would run in parallel with the
default setting, but I think that would have a big impact on users
because the disk access could become random reads and writes when some
indexes are on the same tablespace.

I'm not quite convinced VACUUM should have parallelism disabled by
default. I know some people argued we should do that because making
vacuum faster may put pressure on other parts of the system. Which is
true, but I don't think the solution is to make vacuum slower by
default. IMHO we should do the opposite - have it parallel by default
(as driven by max_parallel_maintenance_workers), and have an option
to disable parallelism.

I think driving parallelism for vacuum by
max_parallel_maintenance_workers might not be sufficient. We need to
give finer control as it depends a lot on the size of indexes. Also,
unlike Create Index, Vacuum can be performed on an entire database,
and it is quite possible that some tables/indexes are relatively small
and forcing parallelism on them by default might slow down the
operation.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#323Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Amit Kapila (#322)

On Mon, Dec 30, 2019 at 10:40:39AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:

v40-0003-Add-FAST-option-to-vacuum-command.patch
------------------------------------------------

I do have a bit of an issue with this part - I'm not quite convinced we
actually need a FAST option, and I actually suspect we'll come to regret
it sooner than later. AFAIK it pretty much does exactly the same thing
as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
multiple ways to do the same thing - I do expect reports from confused
users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
sufficient solution?

I think the motivation of this option is similar to FREEZE. I think
it's sometimes a good idea to have a shortcut for a popular usage and
give it a name corresponding to its job. From that perspective I think
having a FAST option would make sense, but maybe we need more
discussion about the combination of parallel vacuum and vacuum delay.

OK. I think it's mostly an independent piece, so maybe we should move
it to a separate patch. It's more likely to get attention/feedback
when not buried in this thread.

+1. It is already a separate patch, and I think we can even discuss
it more in a new thread once the main patch is committed - or do you
think we should reach a conclusion about it right now? To me, this
option appears to be an extension of the main feature that can be
useful for some users, and people might like to have a separate
option, so we can discuss it and get broader feedback after the main
patch is committed.

I don't think it's an extension of the main feature - it does not depend
on it, it could be committed before or after the parallel vacuum (with
some conflicts, but the feature itself is not affected).

My point was that by moving it into a separate thread we're more likely
to get feedback on it, e.g. from people who don't feel like reviewing
the parallel vacuum feature and/or feel intimidated by the 100+ messages in
this thread.

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC?

How will the user specify the parallel degree? The parallel degree is
helpful because in some cases users can decide how many workers should
be launched based on the size and type of indexes.

By setting max_parallel_maintenance_workers.

It's good enough for CREATE INDEX
so why not here?

That is a different feature, and I think here users can make a better
judgment based on the size of indexes. Moreover, users have an option
to control the parallel degree for 'Create Index' via Alter Table
<tbl_name> Set (parallel_workers = <n>), which I am not sure is a good
idea for parallel vacuum, as the parallelism is derived more from the
size and type of indexes. Now, we can think of a similar parameter at
the table/index level for parallel vacuum, but I don't see it being
equally useful in this case.

I'm a bit skeptical about users being able to pick a good parallel
degree. If we (i.e. experienced developers/hackers with quite a bit of
knowledge) can't come up with a reasonable heuristic, how likely is it
that a regular user will come up with something better?

Not sure I understand why "parallel_workers" would not be suitable for
parallel vacuum? I mean, even for CREATE INDEX the size/type of the
indexes certainly matters, no?

I may be wrong in both cases, of course.

AFAIR there was no such discussion so far, but I think one reason
could be that parallel vacuum should be disabled by default. If
parallel vacuum used max_parallel_maintenance_workers (2 by default)
rather than having the option, it would run in parallel with the
default setting, but I think that would have a big impact on users
because the disk access could become random reads and writes when some
indexes are on the same tablespace.

I'm not quite convinced VACUUM should have parallelism disabled by
default. I know some people argued we should do that because making
vacuum faster may put pressure on other parts of the system. Which is
true, but I don't think the solution is to make vacuum slower by
default. IMHO we should do the opposite - have it parallel by default
(as driven by max_parallel_maintenance_workers), and have an option
to disable parallelism.

I think driving parallelism for vacuum by
max_parallel_maintenance_workers might not be sufficient. We need to
give finer control as it depends a lot on the size of indexes. Also,
unlike Create Index, Vacuum can be performed on an entire database,
and it is quite possible that some tables/indexes are relatively small
and forcing parallelism on them by default might slow down the
operation.

Why wouldn't it be sufficient? Why couldn't this use similar logic to
what we have in plan_create_index_workers for CREATE INDEX?

Sure, it may be useful to give power users a way to override the default
logic, but I very much doubt users can make reliable judgments about
parallelism.

Also, it's not like the risks are comparable in those two cases. If you
have a very large table with a lot of indexes, the gains with parallel
vacuum are pretty much bound to be significant, possibly 10x or more.
OTOH if the table is small, parallelism may not give you much and it may
even be less efficient, but I doubt it's going to be 10x slower. And
min_parallel_index_scan_size already protects us against this, at least
partially.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#324Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Amit Kapila (#321)

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:

v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
-----------------------------------------------------------

I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
it should be called just 'amvacuumoptions' or something like that? The
'parallel' part is actually encoded in names of the options.

amvacuumoptions seems good to me.

Also, why do we need a separate amusemaintenanceworkmem option? Why
don't we simply track it using a separate flag in 'amvacuumoptions'
(or whatever we end up calling it)?

It also seems like a good idea.

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

We discussed it's still unclear whether we really want to commit this
code and therefore it's separated from the main part. Please see more
details here[2].

IMO there's not much reason for the leader not to participate.

The only reason for this is just a debugging/testing aid because
during the development of other parallel features we required such a
knob. The other way could be to have something similar to
force_parallel_mode and there is some discussion about that as well on
this thread but we haven't concluded which is better. So, we decided
to keep it as a separate patch which we can use to test this feature
during development and decide later whether we really need to commit
it. BTW, we have found a few bugs by using this knob in the patch.

OK, understood. Then why not just use force_parallel_mode?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#325Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#324)

On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

Fair enough. I see your point, and as mentioned earlier, we started
with the approach of separate booleans but later found that this is a
better way, as it was easier to set and check the different parallel
options for a parallel vacuum. I think we can go back to the
individual booleans if we want, but I am not sure that is a better
approach for this usage. Sawada-San, others, do you have any opinion
here?

v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
---------------------------------------------------------------

IMHO this should be simply merged into 0002.

We discussed it's still unclear whether we really want to commit this
code and therefore it's separated from the main part. Please see more
details here[2].

IMO there's not much reason for the leader not to participate.

The only reason for this is just a debugging/testing aid because
during the development of other parallel features we required such a
knob. The other way could be to have something similar to
force_parallel_mode and there is some discussion about that as well on
this thread but we haven't concluded which is better. So, we decided
to keep it as a separate patch which we can use to test this feature
during development and decide later whether we really need to commit
it. BTW, we have found a few bugs by using this knob in the patch.

OK, understood. Then why not just use force_parallel_mode?

Because we are not sure what its behavior should be under different
modes, especially what we should do when the user sets
force_parallel_mode = on. We can even consider introducing a new GUC
specific to this, but as of now, I am not convinced that is required.
See some more discussion around this parameter in emails [1][2]. I
think we can decide on this later (probably once the main patch is
committed) as we already have one way to test the patch.

[1]: /messages/by-id/CAFiTN-sUuLASVXm2qOjufVH3tBZHPLdujMJ0RHr47Tnctjk9YA@mail.gmail.com
[2]: /messages/by-id/CA+fd4k6VgA_DG=8=ui7UvHhqx9VbQ-+72X=_GdTzh=J_xN+VEg@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#326Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#323)

On Mon, Dec 30, 2019 at 6:37 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 10:40:39AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

+1. It is already a separate patch, and I think we can even discuss
it more in a new thread once the main patch is committed - or do you
think we should reach a conclusion about it right now? To me, this
option appears to be an extension of the main feature that can be
useful for some users, and people might like to have a separate
option, so we can discuss it and get broader feedback after the main
patch is committed.

I don't think it's an extension of the main feature - it does not depend
on it, it could be committed before or after the parallel vacuum (with
some conflicts, but the feature itself is not affected).

My point was that by moving it into a separate thread we're more likely
to get feedback on it, e.g. from people who don't feel like reviewing
the parallel vacuum feature and/or feel intimidated by the 100+ messages in
this thread.

I agree with this point.

The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
we need a separate VACUUM option, instead of just using the existing
max_parallel_maintenance_workers GUC?

How will the user specify the parallel degree? The parallel degree is
helpful because in some cases users can decide how many workers should
be launched based on the size and type of indexes.

By setting max_parallel_maintenance_workers.

It's good enough for CREATE INDEX
so why not here?

That is a different feature, and I think here users can make a better
judgment based on the size of indexes. Moreover, users have an option
to control the parallel degree for 'Create Index' via Alter Table
<tbl_name> Set (parallel_workers = <n>), which I am not sure is a good
idea for parallel vacuum, as the parallelism is derived more from the
size and type of indexes. Now, we can think of a similar parameter at
the table/index level for parallel vacuum, but I don't see it being
equally useful in this case.

I'm a bit skeptical about users being able to pick a good parallel
degree. If we (i.e. experienced developers/hackers with quite a bit of
knowledge) can't come up with a reasonable heuristic, how likely is it
that a regular user will come up with something better?

In this case, it is highly dependent on the number of indexes (as for
each index, we can spawn one worker). So, it is a bit easier for the
users to specify it. Now, we can also identify it internally, and we
do that in case the user doesn't specify it; however, that can easily
lead to more resource (CPU, I/O) usage than the user would like for a
particular vacuum. So, giving an option to the user sounds quite
reasonable to me. Anyway, in case the user doesn't specify the
parallel degree, we are going to select one internally.
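
A rough sketch of that internal selection, under the assumptions
discussed in this thread (one worker per parallel-capable index,
capped by the user's request and by max_parallel_maintenance_workers;
the function name is hypothetical):

static int
choose_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested)
{
    int         nindexes_parallel = 0;
    int         nworkers;
    int         i;

    /* count the indexes that support some parallel vacuum phase */
    for (i = 0; i < nindexes; i++)
    {
        uint8       vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

        if (vacoptions != VACUUM_OPTION_NO_PARALLEL)
            nindexes_parallel++;
    }

    /* the leader processes one index itself; workers take the rest */
    nworkers = Max(nindexes_parallel - 1, 0);

    /* honor an explicit PARALLEL degree, if the user gave one */
    if (nrequested > 0)
        nworkers = Min(nworkers, nrequested);

    /* and never exceed the GUC */
    return Min(nworkers, max_parallel_maintenance_workers);
}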

Not sure I understand why "parallel_workers" would not be suitable for
parallel vacuum? I mean, even for CREATE INDEX the size/type of the
indexes certainly matters, no?

The difference here is that in parallel vacuum each worker can scan a
separate index, whereas parallel_workers is more of an option for
scanning the heap in parallel. So, if the heap is bigger, then
increasing that value helps, whereas here, if there are more indexes
on the table, increasing the corresponding value for parallel vacuum
can help.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#327Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#325)

On Tue, 31 Dec 2019 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

Fair enough. I see your point, and as mentioned earlier, we started
with the approach of separate booleans but later found that this is a
better way, as it was easier to set and check the different parallel
options for a parallel vacuum. I think we can go back to the
individual booleans if we want, but I am not sure that is a better
approach for this usage. Sawada-San, others, do you have any opinion
here?

If we go back to the individual booleans we would end up having three
booleans: bulkdelete, cleanup, and conditional cleanup. I think making
the bulkdelete option a boolean makes sense, but having two booleans
for cleanup and conditional cleanup might be slightly odd because
these options are exclusive. If we don't stick to having only
booleans, then a ternary value for cleanup might be understandable,
but I'm not sure it's better to have it only for the vacuum purpose.
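
To spell out the alternative, the boolean layout would look something
like this (the member names are hypothetical), with the exclusivity
problem visible in the comments:

/* Sketch of the boolean alternative; member names are hypothetical. */
typedef struct IndexAmRoutine
{
    /* ... existing members ... */
    bool        amparallelbulkdelete;   /* ambulkdelete can run in a worker */
    bool        amparallelcleanup;      /* always run cleanup in a worker */
    bool        amparallelcondcleanup;  /* run cleanup in a worker only when
                                         * bulkdelete was not performed */
    /* ... */
} IndexAmRoutine;

/*
 * The last two flags are mutually exclusive, so every consumer would
 * have to verify that both are never set at once - one reason the
 * single bitmask was preferred.
 */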

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#328Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#327)

On Thu, Jan 2, 2020 at 8:29 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 31 Dec 2019 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

Fair enough. I see your point, and as mentioned earlier, we started
with the approach of separate booleans but later found that this is a
better way, as it was easier to set and check the different parallel
options for a parallel vacuum. I think we can go back to the
individual booleans if we want, but I am not sure that is a better
approach for this usage. Sawada-San, others, do you have any opinion
here?

If we go back to the individual booleans we would end up having three
booleans: bulkdelete, cleanup, and conditional cleanup. I think making
the bulkdelete option a boolean makes sense, but having two booleans
for cleanup and conditional cleanup might be slightly odd because
these options are exclusive.

If we have only three booleans, then we need to check for all three to
conclude that a parallel vacuum is not enabled for any index.
Alternatively, we can have a fourth boolean to indicate that a
parallel vacuum is not enabled. And in the future, when we allow
supporting multiple workers for an index, we might need another
variable unless we can allow it for all types of indexes. This was my
point that having multiple variables for the purpose of a parallel
vacuum (for indexes) doesn't sound like a better approach than having
a single uint8 variable.

If we don't stick to having only booleans, then a ternary value for
cleanup might be understandable, but I'm not sure it's better to have
it only for the vacuum purpose.

If we want to keep the possibility of extending it for other purposes,
then we can probably rename it to amoptions or something like that.
What do you think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#329Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#325)

On Tue, Dec 31, 2019 at 9:09 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

Fair enough. I see your point, and as mentioned earlier, we started
with the approach of separate booleans but later found that this is a
better way, as it was easier to set and check the different parallel
options for a parallel vacuum. I think we can go back to the
individual booleans if we want, but I am not sure that is a better
approach for this usage. Sawada-San, others, do you have any opinion
here?

IMHO, having multiple bools will be confusing compared to what we have
now because these are all related to enabling parallelism for
different phases of the vacuum. So it makes more sense to keep it as
a single variable with multiple options.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#330Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#328)

On Thu, Jan 2, 2020 at 9:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 2, 2020 at 8:29 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 31 Dec 2019 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:

On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I think there's another question we need to ask - why do we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?

This structure member mostly describes one property of the index -
whether it supports parallel vacuum - which I am not sure is true for
other members. Now, we could use separate bool variables for it, as we
were initially doing in the patch, but that takes more space in the
structure without any advantage. Also, using one variable makes the
code a bit better because otherwise, in many places, we need to check
and set four variables instead of one. This is also the reason we used
parallel in its name (we also use *parallel* for parallel index scan
related things). Having said that, we can remove parallel from its
name if we want to extend/use it for something other than a parallel
vacuum. I think we might need to add a flag or two for parallelizing
the heap scan of vacuum when we enhance this feature, so keeping it
for just a parallel vacuum is not completely insane.

I think keeping amusemaintenanceworkmem separate from this variable
seems to me like a better idea as it doesn't describe whether IndexAM
can participate in a parallel vacuum or not. You can see more
discussion about that variable in the thread [1].

I don't know, but IMHO it's somewhat easier to work with separate flags.
Bitmasks make sense when space usage matters a lot, e.g. for on-disk
representation, but that doesn't seem to be the case here I think (if it
was, we'd probably use bitmasks already).

It seems like we're mixing two ways to design the struct unnecessarily,
but I'm not going to nag about this any further.

Fair enough. I see your point, and as mentioned earlier, we started
with the approach of separate booleans but later found that this is a
better way, as it was easier to set and check the different parallel
options for a parallel vacuum. I think we can go back to the
individual booleans if we want, but I am not sure that is a better
approach for this usage. Sawada-San, others, do you have any opinion
here?

If we go back to the individual booleans we would end up having three
booleans: bulkdelete, cleanup, and conditional cleanup. I think making
the bulkdelete option a boolean makes sense, but having two booleans
for cleanup and conditional cleanup might be slightly odd because
these options are exclusive.

If we have only three booleans, then we need to check for all three to
conclude that a parallel vacuum is not enabled for any index.
Alternatively, we can have a fourth boolean to indicate that a
parallel vacuum is not enabled. And in the future, when we allow
supporting multiple workers for an index, we might need another
variable unless we can allow it for all types of indexes. This was my
point that having multiple variables for the purpose of a parallel
vacuum (for indexes) doesn't sound like a better approach than having
a single uint8 variable.

If we don't stick to having only booleans, then a ternary value for
cleanup might be understandable, but I'm not sure it's better to have
it only for the vacuum purpose.

If we want to keep the possibility of extending it for other purposes,
then we can probably rename it to amoptions or something like that.
What do you think?

I think it makes more sense to just keep it for the purpose of
enabling/disabling parallelism in different phases. I am not sure
that adding more options (which are not related to enabling
parallelism in vacuum phases) to the same variable will make sense.
So I think the current name is good for its purpose.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#331Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#320)

On Sun, Dec 29, 2019 at 4:23 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

IMO there's not much reason for the leader not to participate. For
regular queries the leader may be doing useful stuff (essentially
running the non-parallel part of the query) but AFAIK for VACUUM
that's not the case and the leader is not doing anything.

I agree, and said the same thing in
/messages/by-id/CA+Tgmob7JLrngeHz6i60_TqdvE1YBcvGYVoEQ6_xvP=vN7DwGg@mail.gmail.com

I really don't know why we have that code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#332Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#331)

On Fri, Jan 3, 2020 at 10:15 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Dec 29, 2019 at 4:23 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

IMO there's not much reason for the leader not to participate. For
regular queries the leader may be doing useful stuff (essentially
running the non-parallel part of the query) but AFAIK for VACUUM
that's not the case and the leader is not doing anything.

I agree, and said the same thing in
/messages/by-id/CA+Tgmob7JLrngeHz6i60_TqdvE1YBcvGYVoEQ6_xvP=vN7DwGg@mail.gmail.com

I really don't know why we have that code.

We have removed that code from the main patch. It is in a separate
patch and used mainly for development testing where we want to
debug/test the worker code.
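
For context, the knob boils down to something like the following
sketch (the macro name follows the patch mentioned upthread; the
function name and arguments are assumptions):

LaunchParallelWorkers(lps->pcxt);

/*
 * By default the leader vacuums indexes alongside the workers;
 * defining PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION compiles that
 * out so that only the worker code path is exercised during testing.
 */
#ifndef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
    vacuum_indexes_leader(Irel, stats, vacrelstats, lps);
#endif

WaitForParallelWorkersToFinish(lps->pcxt);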

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#333Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#332)
6 attachment(s)

On Sat, 4 Jan 2020 at 07:12, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 3, 2020 at 10:15 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Dec 29, 2019 at 4:23 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

IMO there's not much reason for the leader not to participate. For
regular queries the leader may be doing useful stuff (essentially
running the non-parallel part of the query) but AFAIK for VACUUM
that's not the case and the leader is not doing anything.

I agree, and said the same thing in

/messages/by-id/CA+Tgmob7JLrngeHz6i60_TqdvE1YBcvGYVoEQ6_xvP=vN7DwGg@mail.gmail.com

I really don't know why we have that code.

We have removed that code from the main patch. It is in a separate
patch and used mainly for development testing where we want to
debug/test the worker code.

Hi All,

In the other thread, "parallel vacuum options/syntax" [1], Amit Kapila
asked for opinions about the syntax for making a normal vacuum parallel.
From that thread, I can see that people are in favor of implementing
option (b). So I implemented option (b) on top of the v41 patch set as a
delta patch.

How will vacuum work?

If the user runs "vacuum" or "vacuum table_name", then we will launch
workers based on the number of parallel-supported indexes.
Ex: vacuum table_name;
or vacuum (parallel) table_name; -- both are the same

If the user has requested a parallel degree (1-1024), then we will
launch workers based on the requested degree and the parallel-supported
indexes.
Ex: vacuum (parallel 8) table_name;

If the user doesn't want parallel vacuum, then the parallel degree
should be set to zero.
Ex: vacuum (parallel 0) table_name;
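
Putting the three cases together, the degree selection in the delta
patch behaves like this (a summary sketch, not the actual code):

/*
 * VACUUM t;               => parallel by default; degree picked from
 *                            the number of parallel-supported indexes
 * VACUUM (PARALLEL) t;    => same as above
 * VACUUM (PARALLEL n) t;  => launch workers based on n (1-1024) and
 *                            the parallel-supported indexes
 * VACUUM (PARALLEL 0) t;  => parallel vacuum disabled
 */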

I also did some testing and didn't find any issue after forcing normal
vacuum to be parallel. All the test cases pass, and make check-world
also passes.

Here, I am attaching the delta patch that can be applied on top of the
v41 patch set. Apart from the delta patch, I'm attaching the gist index
patch (v4) and the whole v41 patch set.

Please let me know your thoughts for this.

[1]: /messages/by-id/CAA4eK1LBUfVQu7jCfL20MAF+RzUssP06mcBEcSZb8XktD7X1BA@mail.gmail.com

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch
From b44bc6deae88c9bec552f6de4e6e73a1b252c191 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, if vacuum is canceled or errors out between the first and
second stage) delay the recycling of pages.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information of internal and empty pages
between different stages of vacuum.  It will be quite tricky to share this
information via DSM which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e401..730f3e8 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1
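
The net effect of the gistvacuum.c change above is that empty-page deletion
now runs at the tail of gistvacuumscan() itself, instead of being deferred to
the cleanup stage, so the page sets no longer need to survive between the
bulkdelete and cleanup calls. A condensed sketch of the resulting control
flow (not the literal patch code; the block-scanning loop and error handling
are elided):

    static void
    gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
                   IndexBulkDeleteCallback callback, void *callback_state)
    {
        GistVacState vstate;
        MemoryContext oldctx;

        /* The page sets now live only for the duration of this one scan. */
        vstate.page_set_context =
            GenerationContextCreate(CurrentMemoryContext,
                                    "GiST VACUUM page set context",
                                    16 * 1024);
        oldctx = MemoryContextSwitchTo(vstate.page_set_context);
        vstate.internal_page_set = intset_create();
        vstate.empty_leaf_set = intset_create();
        MemoryContextSwitchTo(oldctx);

        /* ... physical-order scan; gistvacuumpage() fills both sets ... */

        /* Unlink empty leaves before returning, using the memorized pages. */
        gistvacuum_delete_empty_pages(info, &vstate);

        /* Nothing is carried over to gistvacuumcleanup() anymore. */
        MemoryContextDelete(vstate.page_set_context);
    }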

Attachment: v41-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch (application/octet-stream)
From d2e084ec5fb7f583cde325a3bf424088d10aa9ee Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 14:46:37 +0530
Subject: [PATCH v41 1/4] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers, as otherwise
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                       |  4 ++
 doc/src/sgml/indexam.sgml                     |  4 ++
 src/backend/access/brin/brin.c                |  4 ++
 src/backend/access/gin/ginutil.c              |  4 ++
 src/backend/access/gist/gist.c                |  4 ++
 src/backend/access/hash/hash.c                |  3 ++
 src/backend/access/nbtree/nbtree.c            |  3 ++
 src/backend/access/spgist/spgutils.c          |  4 ++
 src/include/access/amapi.h                    |  4 ++
 src/include/commands/vacuum.h                 | 38 +++++++++++++++++++
 .../modules/dummy_index_am/dummy_index_am.c   |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index e2063bac62..1874aeeb44 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6e20..abd8c40e7e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 38593554f0..64bd81a003 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index a259c80616..e29a43bddf 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0c6e..8b9272c05f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 065b5290b0..313e31c71b 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391ee75..cbec182347 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 6e3db06eed..556affb291 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 128f7ae65d..b9becdbe99 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAMs that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAMs that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete has not been
+ * performed yet.  This will be used by IndexAMs that scan the index during
+ * cleanup only when bulkdelete has not been performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAMs that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 053636e4b4..6bfd883fd3 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.23.0
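
For an out-of-core index AM, the opt-in surface added by 0001 is just the two
new IndexAmRoutine fields. A minimal sketch of how a hypothetical AM's
handler would set them (myamhandler is an illustrative name; the flag names
are the ones defined in the patch above):

    Datum
    myamhandler(PG_FUNCTION_ARGS)
    {
        IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

        /* ... all the pre-existing IndexAmRoutine fields ... */

        /* This AM does not allocate maintenance_work_mem-sized state. */
        amroutine->amusemaintenanceworkmem = false;

        /*
         * Allow parallel bulkdelete; allow parallel cleanup only when no
         * bulkdelete was performed in this vacuum (the conditional variant).
         */
        amroutine->amparallelvacuumoptions =
            VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;

        PG_RETURN_POINTER(amroutine);
    }

An AM that cannot carry bulk-deletion state across to amvacuumcleanup would
instead set VACUUM_OPTION_NO_PARALLEL, as dummy_index_am does above.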

Attachment: v41-0003-Add-FAST-option-to-vacuum-command.patch (application/octet-stream)
From c38ab16dbaf3bb8bfe56da5f2a79a6b975d40f13 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:37:09 +0530
Subject: [PATCH v41 3/4] Add FAST option to vacuum command.

---
 doc/src/sgml/ref/vacuum.sgml         | 13 +++++++++
 src/backend/access/heap/vacuumlazy.c | 43 +++++++++++++++++-----------
 src/backend/commands/vacuum.c        |  9 ++++--
 src/include/commands/vacuum.h        |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  4 +++
 6 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9fee083233..b190cb0a98 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -35,6 +35,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
+    FAST [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -250,6 +251,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>FAST</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum while disabling the cost-based vacuum delay feature.
+      Specifying <literal>FAST</literal> is equivalent to performing
+      <command>VACUUM</command> with the
+      <xref linkend="guc-vacuum-cost-delay"/> parameter set to zero.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7231fa2923..74637c3a0e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -218,6 +218,13 @@ typedef struct LVShared
 	 */
 	pg_atomic_uint32 active_nworkers;
 
+	/*
+	 * True if we forcibly disable cost-based vacuum delay during parallel
+	 * index vacuum. This can be true when use specified the FAST vacuum
+	 * option.
+	 */
+	bool		fast;
+
 	/*
 	 * Variables to control parallel index vacuuming.  We have a bitmap to
 	 * indicate which index has stats in shared memory.  The set bit in the
@@ -353,7 +360,7 @@ static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stat
 									int nindexes);
 static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
 											  LVRelStats *vacrelstats, BlockNumber nblocks,
-											  int nindexes, int nrequested);
+											  int nindexes, VacuumParams *params);
 static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
 								LVParallelState *lps, int nindexes);
 static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
@@ -756,7 +763,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (params->nworkers >= 0 && vacrelstats->useindex)
 		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
 									vacrelstats, nblocks, nindexes,
-									params->nworkers);
+									params);
 
 	/*
 	 * Allocate the space for dead tuples in case the parallel vacuum is not
@@ -2006,16 +2013,19 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			ReinitializeParallelDSM(lps->pcxt);
 		}
 
-		/* Enable shared cost balance */
-		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
-		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		if (!lps->lvshared->fast)
+		{
+			/* Enable shared cost balance */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
 
-		/*
-		 * Set up shared cost balance and the number of active workers for
-		 * vacuum delay.
-		 */
-		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
-		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+			/*
+			 * Set up shared cost balance and the number of active workers for
+			 * vacuum delay.
+			 */
+			pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+			pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+		}
 
 		/*
 		 * The number of workers can vary between bulkdelete and cleanup
@@ -2034,7 +2044,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 			VacuumCostBalance = 0;
 			VacuumCostBalanceLocal = 0;
 		}
-		else
+		else if (!lps->lvshared->fast)
 		{
 			/*
 			 * Disable shared cost balance if we are not able to launch
@@ -3052,7 +3062,7 @@ update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
  */
 static LVParallelState *
 begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
-					  BlockNumber nblocks, int nindexes, int nrequested)
+					  BlockNumber nblocks, int nindexes, VacuumParams *params)
 {
 	LVParallelState *lps = NULL;
 	ParallelContext *pcxt;
@@ -3072,7 +3082,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 * a parallel vacuum must be requested and there must be indexes on the
 	 * relation
 	 */
-	Assert(nrequested >= 0);
+	Assert(params->nworkers >= 0);
 	Assert(nindexes > 0);
 
 	/*
@@ -3080,7 +3090,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 	 */
 	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
-													   nrequested,
+													   params->nworkers,
 													   can_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
@@ -3158,6 +3168,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 		(nindexes_mwm > 0) ?
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
+	shared->fast = (params->options & VACOPT_FAST);
 
 	/*
 	 * We need to care about alignment because we estimate the shared memory
@@ -3347,7 +3358,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												  false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = ((VacuumCostDelay > 0) && !(lvshared->fast));
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0672be27f1..3019a72325 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -101,6 +101,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		verbose = false;
 	bool		skip_locked = false;
 	bool		analyze = false;
+	bool		fast = false;
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
@@ -130,6 +131,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		/* Parse options available on VACUUM */
 		else if (strcmp(opt->defname, "analyze") == 0)
 			analyze = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "fast") == 0)
+			fast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "freeze") == 0)
 			freeze = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "full") == 0)
@@ -177,7 +180,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(fast ? VACOPT_FAST : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -416,7 +420,8 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = ((VacuumCostDelay > 0) &&
+							!(params->options & VACOPT_FAST));
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 254a6bcda6..faed3f9718 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -183,7 +183,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FAST = 1 << 8		/* disable vacuum delay */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 8571133fe7..07c7b88a16 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -118,6 +118,9 @@ CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
 WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 -- INDEX_CLEANUP option
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index be4f55616e..6227ab9423 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -99,6 +99,10 @@ VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and F
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
 CREATE INDEX tmp_idx1 ON tmp (a);
 VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+
+--test FAST option
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (FAST) pvactst;
 RESET min_parallel_index_scan_size;
 DROP TABLE pvactst;
 
-- 
2.23.0
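
The behavioral core of the FAST option is a single gating change to the
cost-based delay, applied both in the leader (vacuum.c) and in each parallel
worker (vacuumlazy.c). In essence:

    /*
     * Cost-based delay is active only if a delay is configured and the user
     * did not request FAST; a sketch of the test the patch installs.
     */
    VacuumCostActive = (VacuumCostDelay > 0) &&
                       !(params->options & VACOPT_FAST);

So VACUUM (FAST) behaves like running the command with vacuum_cost_delay set
to zero, matching the documentation added in this patch.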

Attachment: v41-0004-Add-ability-to-disable-leader-participation-in-p.patch (application/octet-stream)
From 4f6910f3d9105be32c7bcfe0013bf71059036eab Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 25 Dec 2019 15:32:23 +0900
Subject: [PATCH v41 4/4] Add ability to disable leader participation in
 parallel vacuum

---
 src/backend/access/heap/vacuumlazy.c | 41 ++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 74637c3a0e..a1ddf036b7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -138,6 +138,13 @@
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
 
+/*
+ * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION disables the leader's
+ * participation in parallel lazy vacuum.  This may be useful as a debugging
+ * aid.
+#undef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+ */
+
 /*
  * Macro to check if we are in a parallel lazy vacuum.  If true, we are
  * in the parallel mode and the DSM segment is initialized.
@@ -270,6 +277,12 @@ typedef struct LVParallelState
 	int			nindexes_parallel_bulkdel;
 	int			nindexes_parallel_cleanup;
 	int			nindexes_parallel_condcleanup;
+
+	/*
+	 * Always true except for a debugging case where
+	 * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
+	 */
+	bool		leaderparticipates;
 } LVParallelState;
 
 typedef struct LVRelStats
@@ -1985,13 +1998,17 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	{
 		if (lps->lvshared->first_time)
 			nworkers = lps->nindexes_parallel_cleanup +
-				lps->nindexes_parallel_condcleanup - 1;
+				lps->nindexes_parallel_condcleanup;
 		else
-			nworkers = lps->nindexes_parallel_cleanup - 1;
+			nworkers = lps->nindexes_parallel_cleanup;
 
 	}
 	else
-		nworkers = lps->nindexes_parallel_bulkdel - 1;
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process takes one index */
+	if (lps->leaderparticipates)
+		nworkers--;
 
 	/*
 	 * It is possible that parallel context is initialized with fewer workers
@@ -2075,8 +2092,9 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
 	 * Join as a parallel worker.  The leader process alone processes all the
 	 * indexes in the case where no workers are launched.
 	 */
-	parallel_vacuum_index(Irel, stats, lps->lvshared,
-						  vacrelstats->dead_tuples, nindexes);
+	if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
+		parallel_vacuum_index(Irel, stats, lps->lvshared,
+							  vacrelstats->dead_tuples, nindexes);
 
 	/* Wait for all vacuum workers to finish */
 	WaitForParallelWorkersToFinish(lps->pcxt);
@@ -2946,6 +2964,7 @@ static int
 compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 								bool *can_parallel_vacuum)
 {
+	bool		leaderparticipates = true;
 	int			nindexes_parallel = 0;
 	int			nindexes_parallel_bulkdel = 0;
 	int			nindexes_parallel_cleanup = 0;
@@ -2987,8 +3006,13 @@ compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
 	if (nindexes_parallel == 0)
 		return 0;
 
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	leaderparticipates = false;
+#endif
+
 	/* The leader process takes one index */
-	nindexes_parallel--;
+	if (leaderparticipates)
+		nindexes_parallel--;
 
 	/* Compute the parallel degree */
 	parallel_workers = (nrequested > 0) ?
@@ -3107,6 +3131,11 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 								 parallel_workers);
 	Assert(pcxt->nworkers > 0);
 	lps->pcxt = pcxt;
+	lps->leaderparticipates = true;
+
+#ifdef PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION
+	lps->leaderparticipates = false;
+#endif
 
 	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
 	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
-- 
2.23.0
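
The worker-count logic that 0004 changes can be condensed as follows (a
slightly paraphrased sketch of lazy_parallel_vacuum_indexes() after the
patch; the names are the LVParallelState fields shown above):

    /*
     * One process per index; the leader takes one index itself unless
     * PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION is defined.
     */
    if (lps->lvshared->for_cleanup)
        nworkers = lps->lvshared->first_time ?
            lps->nindexes_parallel_cleanup + lps->nindexes_parallel_condcleanup :
            lps->nindexes_parallel_cleanup;
    else
        nworkers = lps->nindexes_parallel_bulkdel;

    if (lps->leaderparticipates)
        nworkers--;             /* leave one index for the leader */

With participation compiled out, the leader only waits for the workers to
finish, which makes it easier to exercise the pure worker code paths while
debugging.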

Attachment: v41-0002-Add-a-parallel-option-to-the-VACUUM-command.patch (application/octet-stream)
From e6e85dfc1a98ada5a1c1251a6e271e75f7476a63 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 15:03:14 +0530
Subject: [PATCH v41 2/4] Add a parallel option to the VACUUM command.

This change adds a PARALLEL option to the VACUUM command that enables us to
perform index vacuuming and index cleanup with background workers.  Each
index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.  Users can
specify a parallel degree along with this option, which indicates the
number of workers to use; that number is limited by the number of
indexes on the table.  This option can't be used with the FULL option.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Mahendra Singh and
Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   45 +
 src/backend/access/heap/vacuumlazy.c  | 1252 ++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  132 ++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 12 files changed, 1417 insertions(+), 129 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..74756277b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..9fee083233 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -223,6 +224,32 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on the
+      relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +264,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,6 +355,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 597d8b5f92..7231fa2923 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  Updating the index statistics
+ * requires updating the system table, and since such updates are not
+ * allowed during parallel mode, we update the index statistics after
+ * exiting from the parallel mode.
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance at which a worker sleeps
+	 * for the cost-based delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +949,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +965,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +976,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1171,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1210,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1356,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1426,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1455,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1570,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1604,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1621,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1681,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1699,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1758,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1767,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1815,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1826,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1956,355 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that the parallel context is initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to account for it.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend as we have
+			 * already accumulated the remaining balance of the heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed only by the leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value over to the heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
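+/*
+ * Work is distributed via a shared atomic counter: each participant (the
+ * leader or a worker) claims the next index number from lvshared->idx and
+ * stops once it reaches nindexes.  For example, with a leader, two workers
+ * and five indexes, the five indexes are handed out first-come,
+ * first-served among the three participants.
+ */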
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either in the leader process or in one of
+ * the worker processes.  After processing the index, this function copies
+ * the index statistics returned from ambulkdelete and amvacuumcleanup to
+ * the DSM segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from
+	 * them, because they allocate it locally and it's possible that the
+	 * index will be vacuumed by a different vacuum process next time.
+	 * Copying the result normally happens only after the first index
+	 * vacuuming; from the second time on, we pass the result stored in the
+	 * DSM segment so that they update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active, we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2314,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples, and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2353,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2687,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
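+ *
+ * For example, with maintenance_work_mem = 64MB (65536 kB) and 6-byte
+ * ItemPointerData entries, an indexed table can accumulate roughly 11
+ * million dead tuple TIDs before an index vacuum cycle is forced, subject
+ * to the caps applied below.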
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2711,49 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *)
+		palloc(SizeOfLVDeadTuples + maxtuples * sizeof(ItemPointerData));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2767,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2920,451 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember the indexes that can participate in
+ * parallel index vacuum.
+ */
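+/*
+ * For example, with four indexes supporting parallel bulk-deletion and
+ * three supporting parallel cleanup, nindexes_parallel is 4; the leader
+ * takes one index, leaving 3, so with nrequested = 0 and
+ * max_parallel_maintenance_workers = 2 we request Min(3, 2) = 2 workers.
+ */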
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* Return 0 if no index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * and the size of stats for each index.  Since we currently don't support
+ * parallel vacuum for autovacuum, we don't need to care about
+ * autovacuum_work_mem.
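+ *
+ * For example, with nindexes = 10 the bitmap occupies BITMAPLEN(10) = 2
+ * bytes, and index 9 maps to bit (9 & 0x07) = 1 of byte (9 >> 3) = 1.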
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch even one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested, and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+									   mul_size(sizeof(ItemPointerData), maxtuples)));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
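+
+	/*
+	 * Divide maintenance_work_mem equally among the workers that will use
+	 * it; for example, with maintenance_work_mem = 256MB, parallel_workers
+	 * = 4 and nindexes_mwm = 2, each such worker gets 256MB / Min(4, 2) =
+	 * 128MB.
+	 */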
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to preserve the alignment assumed when the shared memory size
+	 * was estimated above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed during parallel mode, we copy the updated
+ * index statistics from DSM into local memory and later use them to update
+ * the index statistics.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
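+/*
+ * For example, if the NULL-bitmap marks only indexes 0 and 2 as
+ * participating, the stats area holds two LVSharedIndStats entries back to
+ * back; get_indstats(lvshared, 2) steps over the entry for index 0 and
+ * returns a pointer to the second one.
+ */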
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Check if the given index participates in parallel index vacuum or parallel
+ * index cleanup.
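+ *
+ * Returns true when the index should be skipped in the current phase.  For
+ * example, an index advertising VACUUM_OPTION_PARALLEL_BULKDEL together
+ * with VACUUM_OPTION_PARALLEL_COND_CLEANUP is processed in parallel during
+ * bulkdelete, but during cleanup only if no bulkdelete has been performed
+ * yet (first_time).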
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process's.  It's
+	 * okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in order by OID, which should
+	 * match the leader's order.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d147236429..6c9ee65ba2 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context such that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da1da23400..0672be27f1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -99,6 +109,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
+	params.nworkers = -1;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -129,6 +140,28 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				params.nworkers = defGetInt32(opt);
+				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 1 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -170,6 +203,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if ((params.options & VACOPT_FULL) && params.nworkers >= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify FULL option with PARALLEL option")));
+
 	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
@@ -383,6 +421,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1777,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables,
+	 * parallel vacuum is not allowed for temporary relations.  However,
+	 * rather than skipping vacuum on the table, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,65 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers, including the leader process, to
+ * have a shared view of cost-related parameters (mainly VacuumCostBalance).
+ * We allow each worker to update it as and when it incurs any cost, and
+ * then based on that decide whether it needs to sleep.  We allow a worker
+ * to sleep in proportion to the work it has done, and reduce
+ * VacuumSharedCostBalance by the amount consumed by the current worker
+ * (VacuumCostBalanceLocal).  This avoids putting to sleep workers that have
+ * done little or no I/O compared to other workers, and thereby ensures that
+ * workers doing more I/O get throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance is more than the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
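+/*
+ * Worked example with illustrative settings: with VacuumCostLimit = 200,
+ * VacuumCostDelay = 20ms and two active workers, a worker may sleep only
+ * once the shared balance reaches 200 and its local balance exceeds
+ * 0.5 * (200 / 2) = 50; a local balance of 100 then yields a sleep of
+ * 20 * 100 / 200 = 10ms, after which 100 is subtracted from the shared
+ * balance.
+ */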
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the cost balance as we have accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e919317bab..a97cfe2111 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e0db3515d..e2dbd94a3e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3591,7 +3591,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..e89c1252d3 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index c00ae6424c..b9ad6cf671 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b9becdbe99..254a6bcda6 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers. -1 by default for no workers and
+	 * 0 for choosing based on the number of indexes.
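+	 *
+	 * For example, VACUUM (PARALLEL 4) sets this to 4, VACUUM (PARALLEL)
+	 * without a degree sets it to 0, and plain VACUUM leaves it at -1.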
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..8571133fe7 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL 0) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 1 and 1024
+LINE 1: VACUUM (PARALLEL 0) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify FULL option with PARALLEL option
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..be4f55616e 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
2.23.0

delta_patch_to_make_vacuum_as_parallel.patch (application/octet-stream)
commit 8d1e7635e114a450dcc9a3bf2f55b777112dc88c
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Date:   Fri Jan 3 20:01:19 2020 +0530

    Enable parallel vacuum if the parallel option is not given
    
    If the user hasn't specified the parallel option, then also enable
    parallel vacuum and decide the degree of parallelism based on the
    indexes.  If the user requested a degree of zero, disable parallel
    vacuum.

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 8f60fef..f730db1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -106,6 +106,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	ListCell   *lc;
+	bool		disable_parallel_vacuum = false;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
@@ -157,12 +158,22 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			else
 			{
 				params.nworkers = defGetInt32(opt);
-				if (params.nworkers < 1 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
+				if (params.nworkers < 0 || params.nworkers > MAX_PARALLEL_WORKER_LIMIT)
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
-							 errmsg("parallel vacuum degree must be between 1 and %d",
+							 errmsg("parallel vacuum degree must be between 0 and %d",
 									MAX_PARALLEL_WORKER_LIMIT),
 							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * If the parallel degree is given as zero, it means the user
+				 * wants to disable parallel vacuum.
+				 */
+				if (params.nworkers == 0)
+				{
+					disable_parallel_vacuum = true;
+					params.nworkers = -1;
+				}
 			}
 		}
 		else
@@ -213,6 +224,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 				 errmsg("cannot specify FULL option with PARALLEL option")));
 
 	/*
+	 * If the user has not requested parallel vacuum and this is not a full
+	 * vacuum, then by default we enable parallel vacuum and set the degree
+	 * to zero.  Later, based on the number and size of indexes, we decide
+	 * the degree of parallel vacuum.
+	 */
+	if (!(params.options & VACOPT_FULL) && params.nworkers == -1 &&
+		!disable_parallel_vacuum)
+		params.nworkers = 0;
+
+	/*
 	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
 	 * them as -1 which means to use the default values.
 	 */
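To summarize the user-visible behavior after this delta patch, here is a
minimal standalone sketch of the effective decision table (the function and
its shape are illustrative only; the real code also raises an error when
PARALLEL is combined with FULL):

#include <stdbool.h>

/* Illustrative mapping from the PARALLEL option to the internal
 * nworkers convention after this change:
 *   option omitted       -> 0   (degree decided later from the indexes)
 *   PARALLEL 0           -> -1  (user explicitly disabled parallelism)
 *   PARALLEL N (N >= 1)  -> N   (user-requested degree)
 * A FULL vacuum never runs in parallel. */
static int
effective_nworkers(bool parallel_given, int requested, bool full)
{
	if (full)
		return -1;			/* the real code errors out if PARALLEL was given */
	if (!parallel_given)
		return 0;			/* parallel by default; degree chosen later */
	if (requested == 0)
		return -1;			/* PARALLEL 0 disables parallel vacuum */
	return requested;
}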
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07c7b88..7e33955 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -107,9 +107,9 @@ VACUUM (PARALLEL 2) pvactst;
 -- VACUUM invokes parallel bulk-deletion
 UPDATE pvactst SET i = i WHERE i < 1000;
 VACUUM (PARALLEL 2) pvactst;
-VACUUM (PARALLEL 0) pvactst; -- error
-ERROR:  parallel vacuum degree must be between 1 and 1024
-LINE 1: VACUUM (PARALLEL 0) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
                 ^
 VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
 VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 6227ab9..ad7ff4e 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -92,8 +92,7 @@ VACUUM (PARALLEL 2) pvactst;
 -- VACUUM invokes parallel bulk-deletion
 UPDATE pvactst SET i = i WHERE i < 1000;
 VACUUM (PARALLEL 2) pvactst;
-
-VACUUM (PARALLEL 0) pvactst; -- error
+VACUUM (PARALLEL -1) pvactst; -- error
 VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
 VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
 CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
#334Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#333)
3 attachment(s)

On Sat, Jan 4, 2020 at 6:48 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

Hi All,

In the other thread "parallel vacuum options/syntax" [1], Amit Kapila asked for opinions about the syntax for making a normal vacuum parallel. From that thread, I can see that people are in favor of implementing option (b). So I implemented option (b) on top of the v41 patch set as a delta patch.

I looked at your code and changed it slightly to allow the vacuum to
be performed in parallel by default.  Apart from that, I have made a
few other modifications: (a) changed the macro SizeOfLVDeadTuples as
preferred by Tomas [1], (b) updated the documentation, and (c) changed
a few comments.

The first two patches are the same. I have not posted the patch
related to the FAST option as I am not sure we have a consensus for
that and I have also intentionally left DISABLE_LEADER_PARTICIPATION
related patch to avoid confusion.

What do you think of the attached? Sawada-san, kindly verify the
changes and let me know your opinion.

[1]: /messages/by-id/20191229212354.tqivttn23lxjg2jz@development

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From 0dff354f7ef6a4d171e4cafe946aa68f9b1d45f0 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH 1/3] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, if vacuum is canceled or errors out between the first and
second stages) delay the recycling of those pages.

Another consideration is that deleting empty pages in the second stage
requires sharing information about internal and empty pages between the
different stages of vacuum.  Sharing that information via DSM, which the
upcoming parallel vacuum patch requires, would be quite tricky.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index def74fd..ca39404 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

v42-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch (application/octet-stream)
From 7ae640acd7c1451f3153ad0ee37b827653a425b9 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 23 Dec 2019 14:46:37 +0530
Subject: [PATCH 2/3] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)
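One point from the commit message worth spelling out: because each worker
could otherwise consume maintenance_work_mem on its own, the per-worker
budget has to be derived from the single-process budget. A hedged sketch of
that idea (names and the exact division policy are assumptions for
illustration, not this patch's code):

#define Min(a, b) ((a) < (b) ? (a) : (b))

/* Illustrative: split maintenance_work_mem across the workers that run
 * index AMs which actually use it (amusemaintenanceworkmem), so that a
 * parallel vacuum stays within the single-process memory budget. */
static int
maintenance_work_mem_worker_kb(int maintenance_work_mem_kb,
							   int nindexes_mwm, int nworkers)
{
	if (nindexes_mwm == 0 || nworkers == 0)
		return maintenance_work_mem_kb;		/* nothing to divide */

	return maintenance_work_mem_kb / Min(nindexes_mwm, nworkers);
}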

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b..0104d02 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68..37f8d87 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af78..2e8f67e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bc..a7e55ca 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad34..aefc302 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc..4871b7f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e..5254bc7 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908..4924ae1 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8..3b3e22f 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd..b3351ad 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAMs that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAMs that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAMs that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAMs that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
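Stepping out of the diff for a moment, the flags above combine as follows
when deciding whether an index can be handed to a parallel worker in a given
phase. This is a hedged, self-contained sketch (the predicate is hypothetical
and simplified from the patch's logic):

#include <stdbool.h>
#include <stdint.h>

#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)

/* Can this index AM be processed by a parallel worker in the given
 * phase?  bulkdel_done says whether ambulkdelete has already run for
 * this index in the current vacuum. */
static bool
index_participates(uint8_t options, bool for_cleanup, bool bulkdel_done)
{
	/* a valid AM never sets bits outside the defined range */
	if (options > VACUUM_OPTION_MAX_VALID_VALUE)
		return false;

	if (!for_cleanup)
		return (options & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;

	/* cleanup is parallel-safe either unconditionally ... */
	if (options & VACUUM_OPTION_PARALLEL_CLEANUP)
		return true;

	/* ... or only when no bulk-deletion pass has run yet */
	return (options & VACUUM_OPTION_PARALLEL_COND_CLEANUP) && !bulkdel_done;
}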
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06..f326320 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
1.8.3.1

v42-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patch (application/octet-stream)
From 1e1c66eaa3630e23c72122eea7a3fcdef7561e76 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Wed, 8 Jan 2020 16:19:17 +0530
Subject: [PATCH 3/3] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to the VACUUM
command where the user can specify the number of workers that can be used
to perform the command, limited by the number of indexes on a table.
Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in a parallel
vacuum only if its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   64 +-
 src/backend/access/heap/vacuumlazy.c  | 1252 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  147 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 12 files changed, 1445 insertions(+), 135 deletions(-)
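Before diving into the diff, the worker-count rule from the commit message
can be pictured with a hedged sketch (simplified; the patch's
compute_parallel_vacuum_workers additionally checks each index's size and
amparallelvacuumoptions):

#define Min(a, b) ((a) < (b) ? (a) : (b))

/* Illustrative computation of the parallel degree: capped both by the
 * number of indexes that can participate and by
 * max_parallel_maintenance_workers.  nrequested == 0 means the user left
 * the degree to be chosen here. */
static int
parallel_vacuum_degree(int nindexes_participating, int nrequested,
					   int max_parallel_maintenance_workers)
{
	int			nworkers;

	/* each index is processed by at most one process, so a table needs
	 * at least two participating indexes for parallelism to make sense */
	if (nindexes_participating < 2)
		return 0;

	nworkers = (nrequested > 0)
		? Min(nrequested, nindexes_participating)
		: nindexes_participating;

	return Min(nworkers, max_parallel_maintenance_workers);
}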

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c902..7475627 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..7c199a0 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal> option
+   and specify the number of parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -224,6 +229,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for the details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option or the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel vacuum,
+      which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +270,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,10 +361,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified together with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps in proportion to the work done by
+    that worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe904..29ed87f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  To update the index statistics,
+ * we need to update the system table, and since updates are not allowed
+ * during parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,144 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  This is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set it to the
+	 * old live tuples count in the index vacuum case or to the new live
+	 * tuples count in the index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In a single-process lazy vacuum, we could consume more memory during
+	 * index vacuuming or cleanup apart from the memory for heap scanning.  In
+	 * parallel index vacuuming, since individual vacuum workers can each
+	 * consume memory equal to maintenance_work_mem, the maintenance_work_mem
+	 * for each worker is set such that the parallel operation doesn't consume
+	 * more memory than a single-process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which indexes have stats in shared memory.  A set bit in the
+	 * map indicates that the particular index supports parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	pg_atomic_uint32 nprocessed;	/* # of indexes done during parallel
+									 * execution */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
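Two of the macros in this hunk are easy to misread, so here is a
self-contained mock showing what SizeOfLVDeadTuples and IndStatsIsNull
compute. The type and macro signatures below are simplified stand-ins (the
patch's IndStatsIsNull takes the LVShared struct, and ItemPointerData is
PostgreSQL's real 6-byte TID):

#include <stddef.h>
#include <stdio.h>

typedef struct					/* mock of the 6-byte ItemPointerData */
{
	unsigned short bi_hi, bi_lo, offsetNumber;
} ItemPointerData;

typedef struct LVDeadTuples
{
	int			max_tuples;
	int			num_tuples;
	ItemPointerData itemptrs[];	/* flexible array member */
} LVDeadTuples;

/* DSM space for cnt TIDs = struct header + cnt TID slots */
#define SizeOfLVDeadTuples(cnt) \
	(offsetof(LVDeadTuples, itemptrs) + sizeof(ItemPointerData) * (cnt))

/* index i has stats in shared memory iff bit i of the bitmap is set */
#define IndStatsIsNull(bitmap, i) \
	(!((bitmap)[(i) >> 3] & (1 << ((i) & 0x07))))

int
main(void)
{
	unsigned char bitmap[1] = {0x05};	/* indexes 0 and 2 have stats */

	/* header (8 bytes here) + 6 bytes per TID: ~6 MB for 1M dead TIDs */
	printf("space for 1M dead TIDs: %zu bytes\n",
		   SizeOfLVDeadTuples((size_t) 1000000));
	printf("index 1 stats null? %d\n", IndStatsIsNull(bitmap, 1));	/* 1 */
	printf("index 2 stats null? %d\n", IndStatsIsNull(bitmap, 2));	/* 0 */
	return 0;
}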
@@ -128,11 +285,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +308,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +325,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +673,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at beginning of index vacuuming and index
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +693,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +752,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +951,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +967,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +978,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1173,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1212,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1358,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1428,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1457,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1572,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1606,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1623,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1683,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1701,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1760,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1769,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1817,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1828,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1958,355 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used only by the parallel vacuum leader process.  The
+ * caller must set lps->lvshared->for_cleanup to indicate whether to perform
+ * vacuum or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that the parallel context was initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to cap it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the processing counts */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+			pg_atomic_write_u32(&(lps->lvshared->nprocessed), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend, as we
+			 * have already transferred the remaining balance from the heap
+			 * scan to the shared balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Increment the processing count */
+		pg_atomic_add_fetch_u32(&(lvshared->nprocessed), 1);
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either in the leader process or in one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we receive it,
+	 * because the AM allocates it locally and the same index might be
+	 * vacuumed by a different vacuum process next time.  The copy normally
+	 * happens only after the first index vacuuming cycle; from the second
+	 * cycle onwards we pass the result stored in the DSM segment so that
+	 * the AM updates it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2316,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2355,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2689,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
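+ *
+ * As an illustrative example (figures not from the patch itself): with
+ * maintenance_work_mem of 64MB and useindex = true, this allows roughly
+ * 64 * 1024 * 1024 / sizeof(ItemPointerData) = ~11 million TIDs, capped
+ * at INT_MAX and floored at MaxHeapTuplesPerPage.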
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2713,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2768,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2921,450 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation sizes of table don't affect to the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that is
+ * the number of indexes that support parallel index vacuuming.  This function
+ * also sets can_parallel_vacuum to remember indexes that participate in
+ * parallel index vacuum.
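+ *
+ * For example (illustrative numbers): with nrequested = 0, three indexes
+ * supporting parallel bulkdelete and two supporting parallel cleanup,
+ * nindexes_parallel is 3; after reserving one index for the leader we
+ * request 2 workers, subject to the max_parallel_maintenance_workers cap.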
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of stats for each index.  Since we don't currently support
+ * parallel vacuum for autovacuum, we don't need to care about
+ * autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
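+		/* e.g., for i = 10 this sets bit 2 (10 & 0x07) of bitmap[1] (10 >> 3) */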
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns parallel vacuum state if we can launch
+ * even one worker.  This function is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * a parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
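+
+	/*
+	 * As an illustrative example: with maintenance_work_mem of 256MB, two
+	 * requested parallel workers and three indexes that use
+	 * maintenance_work_mem, each worker gets 256MB / Min(2, 3) = 128MB.
+	 */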
+
+	/*
+	 * Keep this offset MAXALIGNed, matching how the shared memory size was
+	 * estimated above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->nprocessed), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and use them later to
+ * update pg_class.  One might think that we could exit parallel mode, update
+ * the index statistics, and then destroy the parallel context, but that
+ * wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuum or parallel
+ * index cleanup.
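+ *
+ * For example, an index AM that sets only
+ * VACUUM_OPTION_PARALLEL_COND_CLEANUP is processed in parallel during
+ * cleanup only when no bulk-deletion pass has been performed yet
+ * (first_time); otherwise it is skipped here.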
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * That's okay because this lock mode does not conflict among the
+	 * parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254..df06e7d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context such that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
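+ *
+ * For instance, parallel vacuum reuses one DSM segment across the
+ * bulk-deletion and cleanup phases and calls this before each
+ * LaunchParallelWorkers cycle, because the two phases can need different
+ * numbers of workers.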
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	 * The number of workers to launch must not exceed the number of workers
+	 * with which the parallel context was initialized.
+	 * number of workers with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e25..945c15e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,40 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested, but the user didn't
+				 * specify the parallel degree.  It will be determined at the
+				 * start of lazy vacuum based on the number of indexes.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				int nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +200,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +436,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1739,6 +1793,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	}
 
 	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table entirely, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
+	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
 	 * us separately.
@@ -1941,16 +2009,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2045,65 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming is
+ * to allow all parallel vacuum workers, including the leader process, to
+ * have a shared view of the cost-related parameters (mainly
+ * VacuumCostBalance).  Each worker updates it whenever it has incurred any
+ * cost, and based on that decides whether it needs to sleep.  A worker
+ * sleeps in proportion to the work it has done, and reduces
+ * VacuumSharedCostBalance by the amount it has consumed
+ * (VacuumCostBalanceLocal).  This avoids putting to sleep workers that have
+ * done little or no I/O compared to other workers, and thereby ensures that
+ * workers doing more I/O get throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance exceeds the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
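+ *
+ * As an illustrative example: with VacuumCostLimit = 200 and 4 active
+ * workers, a worker is put to sleep only once the shared balance has
+ * reached 200 and its own local balance exceeds 0.5 * (200 / 4) = 25.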
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e3..6d1f28c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd8886..99451fd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4ca..479f17c 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708b..fc6a560 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad..56417f0 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means
+	 * choose based on the number of indexes.  -1 indicates parallel vacuum
+	 * is disabled.
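+	 *
+	 * For example, VACUUM (PARALLEL 4) sets this to 4, VACUUM (PARALLEL 0)
+	 * sets it to -1, and plain VACUUM leaves the default 0 (see ExecVacuum).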
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..22cca70 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..d6859a5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
-- 
1.8.3.1
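
A note on the cost-based delay variables added above
(VacuumSharedCostBalance, VacuumActiveNWorkers, VacuumCostBalanceLocal):
each parallel worker accumulates I/O cost both locally and into the shared
balance, and sleeps in proportion to the work it has done itself rather
than the work done by its peers.  A rough sketch of that accounting,
assuming the long-standing VacuumCostBalance, VacuumCostLimit and
VacuumCostDelay globals; the helper name and the exact threshold here are
illustrative, not necessarily the patch's code:

static double
compute_parallel_delay_sketch(void)
{
	double		msec = 0;
	uint32		shared_balance;
	int			nworkers;

	/* Parallel vacuum must be active. */
	Assert(VacuumSharedCostBalance);

	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);

	/* Publish the cost accumulated since the last check. */
	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
											 VacuumCostBalance);

	/* Track how much of the shared balance is this worker's own work. */
	VacuumCostBalanceLocal += VacuumCostBalance;

	/*
	 * Sleep only once the shared balance exceeds the limit and this worker
	 * has itself done a fair share of the work.  The sleep time is based on
	 * the local balance, so each worker sleeps in proportion to its own I/O.
	 */
	if (shared_balance >= VacuumCostLimit &&
		VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers))
	{
		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
								VacuumCostBalanceLocal);
		VacuumCostBalanceLocal = 0;
	}

	return msec;
}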

#335Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#334)

On Wed, 8 Jan 2020 at 22:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 4, 2020 at 6:48 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

Hi All,

In other thread "parallel vacuum options/syntax" [1], Amit Kapila asked opinion about syntax for making normal vacuum to parallel. From that thread, I can see that people are in favor of option(b) to implement. So I tried to implement option(b) on the top of v41 patch set and implemented a delta patch.

I looked at your code and changed it slightly to allow the vacuum to
be performed in parallel by default. Apart from that, I have made a
few other modifications: (a) changed the macro SizeOfLVDeadTuples as
preferred by Tomas [1], (b) updated the documentation, (c) changed a
few comments.

Thanks.

The first two patches are the same. I have not posted the patch
related to the FAST option as I am not sure we have a consensus for
that and I have also intentionally left DISABLE_LEADER_PARTICIPATION
related patch to avoid confusion.

What do you think of the attached? Sawada-san, kindly verify the
changes and let me know your opinion.

I agree to not include the FAST option patch and the
DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
on the main part; we can discuss and add them later if we want.

I've looked at the latest version patch you shared. Overall it looks
good and works fine. I have a few small comments:

1.
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal>option or parallel degree

A space is needed between </literal> and 'option'.

2.
+       /*
+        * Variables to control parallel index vacuuming.  We have a bitmap to
+        * indicate which index has stats in shared memory.  The set bit in the
+        * map indicates that the particular index supports a parallel vacuum.
+        */
+       pg_atomic_uint32 idx;           /* counter for vacuuming and clean up */
+       pg_atomic_uint32 nprocessed;    /* # of indexes done during parallel
+                                        * execution */
+       uint32          offset;                 /* sizeof header incl. bitmap */
+       bits8           bitmap[FLEXIBLE_ARRAY_MEMBER];  /* bit map of NULLs */
+
+       /* Shared index statistics data follows at end of struct */
+} LVShared;

It seems to me that we no longer use nprocessed at all. So we can remove it.
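
For illustration, with nprocessed gone the parallel-control tail of the
struct reduces to the following (a sketch; the fields above this point are
elided and unchanged):

	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
	uint32		offset;			/* sizeof header incl. bitmap */
	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */

	/* Shared index statistics data follows at end of struct */
} LVShared;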

3.
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * relation sizes of table don't affect to the parallel degree for now.

I think the last sentence should be "The relation size of the table
doesn't affect the parallel degree for now".

4.
+       /* cap by max_parallel_maintenance_workers */
+       parallel_workers = Min(parallel_workers,
max_parallel_maintenance_workers);
+       /*
+        * a parallel vacuum must be requested and there must be indexes on the
+        * relation
+        */

+ /* copy the updated statistics */

+       /* parallel vacuum must be active */
+       Assert(VacuumSharedCostBalance);

All comments newly added by the patches start with a capital letter
except for the above four places. Maybe we can change them for
consistency.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#336Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#335)
3 attachment(s)

On Thu, Jan 9, 2020 at 10:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 8 Jan 2020 at 22:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

What do you think of the attached? Sawada-san, kindly verify the
changes and let me know your opinion.

I agreed to not include both the FAST option patch and
DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
on the main part and we can discuss and add them later if want.

I've looked at the latest version patch you shared. Overall it looks
good and works fine. I have a few small comments:

I have addressed all your comments, slightly changed nearby comments,
and ran pgindent. I think we can commit the first two preparatory
patches now unless you or someone else has any more comments on those.
Tomas, most of your comments were on the main patch
(v43-0002-Allow-vacuum-command-to-process-indexes-in-parallel), which
are now addressed, and we have provided the reasons for the proposed
API changes in patch
v43-0001-Introduce-IndexAM-fields-for-parallel-vacuum. Do you have
any concerns if we commit the API patch and then, in a few days' time
(after another pass or two), commit the main patch?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From 0dff354f7ef6a4d171e4cafe946aa68f9b1d45f0 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH 1/3] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, if vacuum is canceled or errors out between the first and
second stages) delay the recycling of pages.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information about internal and empty pages
between the different stages of vacuum.  It would be quite tricky to share
this information via DSM, which is required for the upcoming parallel
vacuum patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index def74fd..ca39404 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

v43-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch (application/octet-stream)
From 60a55025659cddc4456063174e22d37ae3ddb43a Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 14:36:35 +0530
Subject: [PATCH 1/2] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers, as otherwise
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions field tells whether a particular IndexAM
participates in a parallel vacuum and, if so, in which phases (bulkdelete,
vacuumcleanup) of vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b..0104d02 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68..37f8d87 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af78..2e8f67e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bc..a7e55ca 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad34..aefc302 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc..4871b7f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e..5254bc7 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908..4924ae1 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8..3b3e22f 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd..b3351ad 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAM's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06..f326320 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
1.8.3.1
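
The amparallelvacuumoptions flags above combine as a bitmask.  As a
hypothetical helper (not part of the patch set), here is how a caller
could decide, following the semantics documented in vacuum.h, whether an
index is eligible for parallel processing in a given phase:

/*
 * Hypothetical helper: does this index AM participate in the current
 * parallel vacuum phase?  vacoptions is the AM's amparallelvacuumoptions
 * bitmask; bulkdel_done says whether a bulk-deletion pass has already
 * been performed on the index.
 */
static bool
index_supports_parallel_phase(uint8 vacoptions, bool for_cleanup,
							  bool bulkdel_done)
{
	if (!for_cleanup)
		return (vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;

	/*
	 * Cleanup phase: the AM either always supports parallel cleanup, or
	 * supports it only when no bulk-deletion pass has run yet.
	 */
	return (vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0 ||
		((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0 &&
		 !bulkdel_done);
}

Under these rules, btree (BULKDEL | COND_CLEANUP) is processed in parallel
during bulk-deletion and during cleanup only when no bulk-deletion pass has
run, whereas BRIN (CLEANUP only) participates only in the cleanup phase.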

v43-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patch (application/octet-stream)
From a0da6b9770f6ab6cc8f7b29a4efca078038006d8 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH 2/2] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to the
VACUUM command with which the user can specify the number of workers used
to perform the command; that number is limited by the number of indexes on
the table.  Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and is further limited by
max_parallel_maintenance_workers.  An index can participate in a parallel
vacuum only if its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   14 +-
 doc/src/sgml/ref/vacuum.sgml          |   64 +-
 src/backend/access/heap/vacuumlazy.c  | 1245 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  148 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1443 insertions(+), 135 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c902..7475627 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..f9c7b1a 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal>
+   option and specify parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -224,6 +229,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option or the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel
+      vacuum operation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  Please note
+      that it is not guaranteed that the number of parallel workers specified
+      in <replaceable class="parameter">integer</replaceable> will be used
+      during execution.  It is possible for a vacuum to run with fewer workers
+      than specified, or even with no workers at all.  Only one worker can
+      be used per index, so parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before the start of each phase and exit at the
+      end of the phase.  These behaviors might change in a future release.
+      This option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +270,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,10 +361,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps in proportion to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe904..0e0b710 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we
+ * prepare the parallel context and initialize the DSM segment that contains
+ * shared information as well as the memory space for storing dead tuples.
+ * When starting either index vacuuming or index cleanup, we launch parallel
+ * worker processes.  Once all indexes are processed the parallel worker
+ * processes exit.  After that, the leader process re-initializes the parallel
+ * context so that it can use the same DSM for multiple passes of index
+ * vacuum and for performing index cleanup.  For updating the index statistics,
+ * we need to update the system table and, since updates are not
+ * allowed during parallel mode, we update the index statistics after exiting
+ * from parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuuming or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuuming, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuuming and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +949,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +965,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +976,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1171,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1210,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1356,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1426,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1455,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1570,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1604,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1621,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1681,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1699,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers if this is a parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1758,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1767,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1815,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1826,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1956,351 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuuming or index cleanup with parallel workers.  This
+ * function must be used by the parallel vacuum leader process. The caller
+ * must set lps->lvshared->for_cleanup to indicate whether to perform vacuum
+ * or cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;
+
+	/*
+	 * It is possible that parallel context is initialized with fewer workers
+	 * than the number of indexes that need a separate worker in the current
+	 * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend, as its
+			 * balance has already been added to the shared balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index, this function copies the
+ * index statistics returned from ambulkdelete or amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Point to the bulk-deletion result stored in the DSM segment if a
+		 * previous pass has already copied it there.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete or
+	 * amvacuumcleanup to the DSM segment the first time we receive it,
+	 * because the access method allocates it locally and a different vacuum
+	 * process might process this index the next time around.  The copying
+	 * therefore normally happens only after the first round of index
+	 * vacuuming.  From the second round onward, we pass the result stored in
+	 * the DSM segment so that the access method updates it directly.
+	 *
+	 * Since each vacuum worker writes its bulk-deletion result to a
+	 * different slot, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel
+ * vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2310,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2349,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2683,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2707,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2762,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2915,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuuming and index cleanup can be executed with parallel workers.  The
+ * size of the relation doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that is
+ * the number of indexes that support parallel index vacuuming.  This function
+ * also sets can_parallel_vacuum to remember indexes that participate in
+ * parallel index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of stats for each index.  Since we currently don't support
+ * parallel vacuum for autovacuum, we needn't care about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * Keep the offset MAXALIGNed so that it matches how we estimated the
+	 * shared memory size above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM segment into local memory and use that
+ * later to update the index statistics.  One might think that we could exit
+ * from parallel mode, update the index statistics, and then destroy the
+ * parallel context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuum or cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as that of the leader
+	 * process.  It's okay because this lock mode does not conflict among
+	 * the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, so its order matches
+	 * the order in which the leader opened them.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
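
To illustrate how get_indstats() walks the variable-size DSM area above, here
is a minimal standalone C sketch (illustration only, not part of the patch;
ModelShared and ModelIndStats are simplified stand-ins for LVShared and
LVSharedIndStats).  Indexes that don't participate in parallel vacuum have a
NULL slot in the bitmap and consume no stats space, so the walk must count
only the participating indexes:

#include <stdio.h>
#include <string.h>

/* Simplified stand-in for LVSharedIndStats */
typedef struct ModelIndStats
{
	int			num_pages;
	double		num_index_tuples;
} ModelIndStats;

/* Simplified stand-in for LVShared: NULL bitmap followed by packed stats */
typedef struct ModelShared
{
	unsigned char bitmap[1];	/* one bit per index; up to 8 indexes here */
	ModelIndStats stats[8];		/* slots used only by participating indexes */
} ModelShared;

#define IND_STATS_IS_NULL(s, i) \
	((((s)->bitmap[(i) >> 3] >> ((i) & 0x07)) & 1) == 0)

/* Return the stats slot for index n, or NULL if it doesn't participate */
static ModelIndStats *
model_get_indstats(ModelShared *shared, int n)
{
	ModelIndStats *p = shared->stats;
	int			i;

	if (IND_STATS_IS_NULL(shared, n))
		return NULL;

	/* Skip the slots of preceding participating indexes only */
	for (i = 0; i < n; i++)
	{
		if (!IND_STATS_IS_NULL(shared, i))
			p++;
	}

	return p;
}

int
main(void)
{
	ModelShared shared;

	memset(&shared, 0, sizeof(shared));

	/* Indexes 0 and 2 participate in parallel vacuum; index 1 does not */
	shared.bitmap[0] |= (1 << 0) | (1 << 2);
	shared.stats[0].num_pages = 10;		/* slot for index 0 */
	shared.stats[1].num_pages = 20;		/* next packed slot is index 2's */

	printf("index 1 participates: %s\n",
		   model_get_indstats(&shared, 1) ? "yes" : "no");
	printf("index 2 num_pages: %d\n",
		   model_get_indstats(&shared, 2)->num_pages);
	return 0;
}
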
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254..df06e7d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
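
As a usage sketch of the new API (illustration only; the real ParallelContext
machinery requires a running backend, so this models just the worker-count
bookkeeping): create the context once with the maximum number of workers,
then clamp the per-phase launch count with ReinitializeParallelWorkers before
each LaunchParallelWorkers call:

#include <assert.h>
#include <stdio.h>

/* Toy stand-in for ParallelContext, modeling only the worker counts */
typedef struct ModelParallelContext
{
	int			nworkers;			/* maximum, fixed at creation */
	int			nworkers_to_launch; /* can vary between phases */
} ModelParallelContext;

/* Mirrors the contract of ReinitializeParallelWorkers */
static void
model_reinitialize_workers(ModelParallelContext *pcxt, int nworkers_to_launch)
{
	assert(pcxt->nworkers >= nworkers_to_launch);
	pcxt->nworkers_to_launch = nworkers_to_launch;
}

/* Mirrors the changed loop bound in LaunchParallelWorkers */
static void
model_launch_workers(ModelParallelContext *pcxt)
{
	int			i;

	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
		return;

	for (i = 0; i < pcxt->nworkers_to_launch; i++)
		printf("registering worker %d\n", i);
}

int
main(void)
{
	/* Context sized for the bulk-deletion phase, say four workers... */
	ModelParallelContext pcxt = {4, 4};

	model_launch_workers(&pcxt);

	/* ...then reused for a cleanup phase that needs only two */
	model_reinitialize_workers(&pcxt, 2);
	model_launch_workers(&pcxt);
	return 0;
}
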
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e25..6526cc1 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,41 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum based on the
+				 * number of indexes.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +201,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +437,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1739,6 +1794,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	}
 
 	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table entirely, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
+	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
 	 * us separately.
@@ -1941,16 +2010,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2046,65 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance).
+ * We allow each worker to update it as and when it incurs any cost, and
+ * then based on that decide whether it needs to sleep.  A worker is allowed
+ * to sleep in proportion to the work it has done, and we reduce
+ * VacuumSharedCostBalance by the amount consumed by that worker
+ * (VacuumCostBalanceLocal).  This avoids putting to sleep workers that have
+ * done little or no I/O compared to other workers, and thereby ensures that
+ * workers doing more I/O are throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance is more than the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least the current process itself should be counted */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
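
To make the 50% throttling rule concrete, here is a small self-contained
simulation (illustration only; cost_limit and cost_delay are hypothetical
stand-ins for VacuumCostLimit and VacuumCostDelay).  With two active workers
the sleep threshold is 0.5 * (200 / 2) = 50, so the worker that has consumed
180 cost units sleeps for 20 * 180 / 200 = 18 ms, while the lightly loaded
worker does not sleep at all:

#include <stdio.h>

/* Hypothetical stand-ins for VacuumCostLimit and VacuumCostDelay (ms) */
static const int	cost_limit = 200;
static const double cost_delay = 20.0;

static int	shared_balance = 0; /* models VacuumSharedCostBalance */

typedef struct ModelWorker
{
	int			local_balance;	/* models VacuumCostBalanceLocal */
	int			pending_cost;	/* models VacuumCostBalance */
} ModelWorker;

/* Mirrors the decision logic of compute_parallel_delay */
static double
model_parallel_delay(ModelWorker *w, int nworkers)
{
	double		msec = 0;

	/* Accumulate the pending cost into the shared and local balances */
	shared_balance += w->pending_cost;
	w->local_balance += w->pending_cost;
	w->pending_cost = 0;

	if (shared_balance >= cost_limit &&
		w->local_balance > 0.5 * (cost_limit / nworkers))
	{
		/* Sleep in proportion to the work this worker has done */
		msec = cost_delay * w->local_balance / cost_limit;
		shared_balance -= w->local_balance;
		w->local_balance = 0;
	}

	return msec;
}

int
main(void)
{
	ModelWorker idle = {0, 30};		/* light I/O */
	ModelWorker busy = {0, 180};	/* heavy I/O */

	printf("idle worker sleeps %.1f ms\n", model_parallel_delay(&idle, 2));
	printf("busy worker sleeps %.1f ms\n", model_parallel_delay(&busy, 2));
	return 0;
}
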
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e3..6d1f28c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd8886..99451fd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4ca..479f17c 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708b..fc6a560 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad..56417f0 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default which means choose
+	 * based on number of indexes.  -1 indicates parallel vacuum is disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
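
For quick reference, the convention for the new nworkers field can be
sketched as follows (illustration only; describe_nworkers is a hypothetical
helper, not part of the patch):

#include <stdio.h>

/*
 * Model of the VacuumParams.nworkers convention:
 *   -1 : parallel vacuum disabled (e.g. PARALLEL 0, or a temporary table)
 *    0 : enabled, degree chosen from the number of indexes (the default)
 *   >0 : enabled with the user-requested degree (PARALLEL n)
 */
static const char *
describe_nworkers(int nworkers)
{
	if (nworkers < 0)
		return "parallel vacuum disabled";
	if (nworkers == 0)
		return "degree chosen from the number of indexes";
	return "user-requested degree";
}

int
main(void)
{
	int			samples[] = {-1, 0, 2};
	int			i;

	for (i = 0; i < 3; i++)
		printf("nworkers = %2d: %s\n", samples[i],
			   describe_nworkers(samples[i]));
	return 0;
}
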
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..22cca70 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..d6859a5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86..0242e66 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
 LVRelStats
+LVShared
+LVSharedIndStats
+LVParallelState
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
1.8.3.1

#337Sergei Kornilov
sk@zsrv.org
In reply to: Amit Kapila (#336)

Hello

I noticed that parallel vacuum uses the min_parallel_index_scan_size GUC to skip small indexes, but this is not mentioned in the documentation for either the VACUUM command or the GUC itself.

+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;

(lazy_parallel_vacuum_indexes)
Perhaps we need to add a comment for future readers explaining why we reduce the number of workers by 1. Maybe this would be cleaner?

+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+   /* The leader process will participate */
+   nworkers--;

I have no more comments after reading the patches.

regards, Sergei

#338Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Sergei Kornilov (#337)

On Thu, 9 Jan 2020 at 17:31, Sergei Kornilov <sk@zsrv.org> wrote:

Hello

I noticed that parallel vacuum uses the min_parallel_index_scan_size GUC to skip small indexes, but this is not mentioned in the documentation for either the VACUUM command or the GUC itself.

+       /* Determine the number of parallel workers to launch */
+       if (lps->lvshared->for_cleanup)
+       {
+               if (lps->lvshared->first_time)
+                       nworkers = lps->nindexes_parallel_cleanup +
+                               lps->nindexes_parallel_condcleanup - 1;
+               else
+                       nworkers = lps->nindexes_parallel_cleanup - 1;
+
+       }
+       else
+               nworkers = lps->nindexes_parallel_bulkdel - 1;

The v43-0001-Introduce-IndexAM-fields-for-parallel-vacuum patch looks
fine to me.

Below are some review comments for v43-0002 patch.

1.
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem

I think that since we now also support zero as a degree, it should be
changed from "positive integer" to "positive integer (including zero)".

2.
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we

I think the above sentence should be like "Each individual index is
processed by one vacuum process." (or "one worker").

3.
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup

Here, "index vacuuming" should be changed to "index vacuum" or "index
cleanup" to "index cleaning"

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#339Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#336)

On Thu, 9 Jan 2020 at 19:33, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 9, 2020 at 10:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 8 Jan 2020 at 22:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

What do you think of the attached? Sawada-san, kindly verify the
changes and let me know your opinion.

I agreed not to include the FAST option patch and the
DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
on the main part, and we can discuss and add them later if we want.

I've looked at the latest version patch you shared. Overall it looks
good and works fine. I have a few small comments:

I have addressed all your comments and slightly changed nearby comments
and ran pgindent. I think we can commit the first two preparatory
patches now unless you or someone else has any more comments on those.

Yes.

I'd like to briefly summarize
v43-0002-Allow-vacuum-command-to-process-indexes-in-parallel for other
reviewers who are newly starting to review this patch:

This patch introduces the PARALLEL option to the VACUUM command.
Parallel vacuum is enabled by default. The number of parallel workers
is determined based on the number of indexes that support parallel
index vacuum when the user didn't specify the parallel degree or the
PARALLEL option is omitted. Specifying PARALLEL 0 disables parallel
vacuum.
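
For illustration, here is a rough, self-contained sketch of that
worker-count decision; the function below is my own simplification
(in the patch, this logic lives in compute_parallel_vacuum_workers()):

/* Rough sketch only; see compute_parallel_vacuum_workers() in the
 * patch for the real logic.  All names below are illustrative. */
static int
compute_workers_sketch(int nindexes_parallel, int nrequested,
                       int max_parallel_maintenance_workers)
{
    /* the leader always processes one index itself */
    int     nworkers = nindexes_parallel - 1;

    /* honor an explicit PARALLEL degree, if one was given */
    if (nrequested > 0 && nrequested < nworkers)
        nworkers = nrequested;

    /* never exceed max_parallel_maintenance_workers */
    if (nworkers > max_parallel_maintenance_workers)
        nworkers = max_parallel_maintenance_workers;

    return (nworkers > 0) ? nworkers : 0;
}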

In the parallel vacuum of this patch, only the leader process scans
the heap and collects dead tuple TIDs on the DSM segment. Before
starting index vacuum or index cleanup, the leader launches the
parallel workers and performs the work together with them. Each
individual index is processed by one vacuum worker process, so
parallel vacuum can be used when the table has at least 2 indexes
(the leader always takes one index). After launching the parallel
workers, the leader process first vacuums the indexes that don't
support parallel index vacuum, while the parallel workers process the
indexes that do; the leader joins them as a worker after completing
such indexes. Once all indexes are processed, the parallel worker
processes exit. After that, the leader process re-initializes the
parallel context so that it can use the same DSM for multiple passes
of index vacuum and for performing index cleanup. For updating the
index statistics, we need to update the system catalog, and since
updates are not allowed during parallel mode, we update the index
statistics after exiting from parallel mode.
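
Schematically, the leader-side flow described above could look like
the sketch below, built on the existing parallel-context APIs from
access/parallel.h; error handling and the actual index-processing
calls are omitted, and the loop structure is a simplification of what
the patch does:

#include "access/parallel.h"

/* Simplified sketch of the leader's use of the parallel context. */
static void
leader_flow_sketch(ParallelContext *pcxt, int nindexscans)
{
    for (int i = 0; i < nindexscans; i++)
    {
        /* workers start processing parallel-safe indexes */
        LaunchParallelWorkers(pcxt);

        /* ... leader handles indexes that don't support parallel
         * vacuum, then joins in on the remaining ones ... */

        WaitForParallelWorkersToFinish(pcxt);

        /* reuse the same DSM for the next bulk-delete pass or for
         * the final index cleanup */
        ReinitializeParallelDSM(pcxt);
    }
}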

When the cost-based vacuum delay is enabled, parallel vacuum is
throttled as well. The basic idea of a cost-based vacuum delay for
parallel index vacuuming is to give all parallel vacuum workers,
including the leader process, a shared view of the cost-related
parameters (mainly VacuumCostBalance). Each worker updates it as and
when it incurs any cost and, based on that, decides whether it needs
to sleep. A worker sleeps in proportion to the work it has done and
reduces VacuumSharedCostBalance by the amount consumed by that worker
(VacuumCostBalanceLocal). This avoids putting workers that have done
little or no I/O to sleep on behalf of the others, and therefore
ensures that workers doing more I/O get throttled more.
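
To make that scheme concrete, here is a minimal sketch of the delay
decision, using plain C11 atomics in place of the patch's pg_atomic
counters; the constants and the exact accounting below are
assumptions for illustration only:

#include <stdatomic.h>
#include <unistd.h>

/* Shared across workers; in the patch these live in the DSM segment
 * (LVShared.cost_balance and LVShared.active_nworkers). */
static _Atomic unsigned shared_cost_balance;
static _Atomic unsigned active_nworkers;

/* Cost consumed so far by this worker (VacuumCostBalanceLocal). */
static unsigned local_cost_balance;

/* Hypothetical stand-ins for vacuum_cost_limit / vacuum_cost_delay. */
#define COST_LIMIT    200
#define COST_DELAY_US 2000

static void
parallel_delay_point_sketch(unsigned cost_incurred)
{
    unsigned shared = atomic_fetch_add(&shared_cost_balance,
                                       cost_incurred) + cost_incurred;
    unsigned nworkers = atomic_load(&active_nworkers);

    local_cost_balance += cost_incurred;

    /* Sleep only once the shared balance exceeds the limit and this
     * worker has done at least its proportional share of the work. */
    if (shared >= COST_LIMIT && nworkers > 0 &&
        local_cost_balance > COST_LIMIT / nworkers)
    {
        /* sleep proportionally to the work done by this worker */
        usleep((useconds_t) COST_DELAY_US * local_cost_balance / COST_LIMIT);

        /* give back what this worker has consumed */
        atomic_fetch_sub(&shared_cost_balance, local_cost_balance);
        local_cost_balance = 0;
    }
}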

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#340Amit Kapila
amit.kapila16@gmail.com
In reply to: Sergei Kornilov (#337)
3 attachment(s)

On Thu, Jan 9, 2020 at 5:31 PM Sergei Kornilov <sk@zsrv.org> wrote:

Hello

I noticed that parallel vacuum uses the min_parallel_index_scan_size GUC to skip small indexes, but this is not mentioned in the documentation for either the VACUUM command or the GUC itself.

Changed documentation at both places.

+       /* Determine the number of parallel workers to launch */
+       if (lps->lvshared->for_cleanup)
+       {
+               if (lps->lvshared->first_time)
+                       nworkers = lps->nindexes_parallel_cleanup +
+                               lps->nindexes_parallel_condcleanup - 1;
+               else
+                       nworkers = lps->nindexes_parallel_cleanup - 1;
+
+       }
+       else
+               nworkers = lps->nindexes_parallel_bulkdel - 1;

(lazy_parallel_vacuum_indexes)
Perhaps we need to add a comment for future readers explaining why we reduce the number of workers by 1. Maybe this would be cleaner?

Adapted your suggestion.

I have no more comments after reading the patches.

Thank you for reviewing the patch.

1.
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a positive integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem

I think that since we now also support zero as a degree, it should be
changed from "positive integer" to "positive integer (including zero)".

I have replaced it with "non-negative integer .."

2.
+ * with parallel worker processes.  Individual indexes are processed by one
+ * vacuum process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we

I think the above sentence should be like "Each individual index is
processed by one vacuum process." (or "one worker").

Hmm, in the above sentence "vacuum process" refers to either a leader
or a worker process, so I am not sure that what you are suggesting is
an improvement over the current wording.

3.
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuuming and index cleanup

Here, "index vacuuming" should be changed to "index vacuum" or "index
cleanup" to "index cleaning"

Okay, changed at the place you mentioned and at other places where a
similar change is required.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From d38c5ed75cb84f1961ab19d63a93014b771121a0 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH 1/3] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say when vacuum is canceled or errors out between the first and
second stage) delay the pages from being recycled.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information of internal and empty pages
between different stages of vacuum.  It will be quite tricky to share this
information via DSM which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69296..fffdfff6e1 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index def74fdaa3..a9c616c772 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages.  They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,24 +83,12 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
-	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
 	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
2.16.2.windows.1

v44-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch (application/octet-stream)
From a7dd7c7384afed0505d7d459739ee040f570f69f Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 14:36:35 +0530
Subject: [PATCH 2/3] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions field tells whether a particular IndexAM
participates in a parallel vacuum and, if so, in which phase (bulkdelete,
vacuumcleanup) of vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b9f0..0104d02f67 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af7844d..2e8f67ef10 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bcb91..a7e55caf28 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad341b3..aefc302ed2 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc98f..4871b7ff4d 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e6b7..5254bc7ef5 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908764..4924ae1c59 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8d3e..3b3e22f73d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd0c1..b3351ad406 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAm's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase of index irrespective of whether the index is
+ * already scanned or not during bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06639..f32632089b 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.16.2.windows.1

v44-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patch (application/octet-stream)
From deb8370d15be9fae899eaf26f9a6b9d91684ed65 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH 3/3] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command, which is limited by the number of indexes on a
table.  Specifying zero as the number of workers will disable parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in a parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   66 +-
 src/backend/access/heap/vacuumlazy.c  | 1248 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  148 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1451 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..beb3d599c9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4895,7 +4895,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..b8435da7fa 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use <literal>PARALLEL</literal> option and
+   specify parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -223,6 +228,35 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option or the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel
+      vacuum operation, which is further limited by
+      <xref linkend="guc-max-parallel-maintenance-workers"/>.  An index can
+      participate in a parallel vacuum if and only if the size of the index is
+      more than <xref linkend="guc-min-parallel-index-scan-size"/>.  Please
+      note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +271,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,11 +362,19 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps proportional to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe90485f..f12c41f26b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table and since updates are not allowed during
+ * parallel mode we update the index statistics after exiting from the
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuum, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuum and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode
+ *		and then create both the parallel context and the DSM segment before
+ *		starting the heap scan, so that we can record dead tuples in the DSM
+ *		segment.  All parallel workers are launched at the beginning of index
+ *		vacuuming and index cleanup, and they exit once done with all
+ *		indexes.  At the end of this function we exit from parallel mode.
+ *		Index bulk-deletion results are stored in the DSM segment, and we
+ *		update the statistics for all indexes after exiting from parallel
+ *		mode, since writes are not allowed while in parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,28 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize the parallel vacuum if requested
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+		lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+									vacrelstats, nblocks, nindexes,
+									params->nworkers);
+
+	/*
+	 * Allocate the space for dead tuples if parallel vacuum was not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +949,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +965,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +976,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1171,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1210,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1356,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1426,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1455,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1570,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1604,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1621,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics, as we cannot
+	 * write while in parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1681,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1699,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1758,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1767,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1815,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1826,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1956,354 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * The parallel context may have been initialized with fewer workers
+	 * than the number of indexes that need a separate worker in the current
+	 * phase, so cap the value accordingly.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend, as we have
+			 * already accumulated the remaining balance of the heap scan
+			 * into the shared balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
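/*
 * A standalone sketch (hypothetical names, not part of the patch) of the
 * worker-count choice made above: the leader always participates, so one
 * index's worth of work is subtracted before clamping to the number of
 * workers the parallel context was initialized with.
 */
static int
choose_nworkers_sketch(int nindexes_in_phase, int pcxt_nworkers)
{
	int		nworkers = nindexes_in_phase - 1;	/* the leader takes one index */

	return (nworkers < pcxt_nworkers) ? nworkers : pcxt_nworkers;
}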
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip indexes that don't participate in the parallel operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
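/*
 * A standalone sketch (hypothetical names, not part of the patch) of the
 * work-claiming loop above, using C11 atomics in place of pg_atomic: each
 * participant grabs the next unprocessed index number until the counter runs
 * past nindexes, so no index is ever processed twice.
 */
#include <stdatomic.h>

static void
claim_indexes_sketch(atomic_uint *next_idx, unsigned nindexes)
{
	for (;;)
	{
		unsigned	idx = atomic_fetch_add(next_idx, 1);

		if (idx >= nindexes)
			break;
		/* vacuum or clean up index number idx here */
	}
}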
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because they don't support parallel operation in this phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up one index, either in the leader process or in one of
+ * the worker processes.  After processing the index, this function copies
+ * the index statistics returned from ambulkdelete and amvacuumcleanup to
+ * the DSM segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we receive it,
+	 * because the access methods allocate it locally and a different vacuum
+	 * process might process this index next time.  The copy therefore
+	 * normally happens only after the first index-vacuuming pass; from the
+	 * second pass onwards we pass the result stored in the DSM segment so
+	 * that the access methods update it in place.
+	 *
+	 * Since each vacuum worker writes its bulk-deletion result to a
+	 * different slot, no locking is needed here.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points into the DSM segment, the locally
+		 * allocated result is no longer needed.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
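/*
 * A standalone sketch (hypothetical types, not part of the patch) of the
 * "copy once, then update in place" protocol above: the first completed pass
 * copies the locally allocated result into the shared slot and repoints the
 * local pointer at it; later passes write through the shared slot directly,
 * so no second copy is ever needed.
 */
#include <stdbool.h>
#include <stdlib.h>

typedef struct SlotSketch
{
	bool		updated;
	int			stats;			/* stands in for IndexBulkDeleteResult */
} SlotSketch;

static void
publish_result_sketch(SlotSketch *slot, int **result)
{
	if (!slot->updated && *result != NULL)
	{
		slot->stats = **result;
		slot->updated = true;
		free(*result);			/* the local copy is no longer needed */
		*result = &slot->stats;
	}
}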
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2313,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2352,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2686,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2710,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
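/*
 * Worked example (assumed settings, not part of the patch): with
 * maintenance_work_mem = 64MB and sizeof(ItemPointerData) = 6, the ceiling
 * computed above allows roughly 11 million dead-tuple TIDs before an index
 * vacuuming cycle must be triggered; the function then clamps this further
 * by the table size.
 */
static long
max_dead_tuples_example(void)
{
	long		work_mem_kb = 64 * 1024;	/* assumed maintenance_work_mem */
	long		tid_size = 6;				/* sizeof(ItemPointerData) */

	return (work_mem_kb * 1024L) / tid_size;	/* ~11 million TIDs */
}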
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2765,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2918,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The
+ * size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree from the number of indexes
+ * that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember which indexes participate in parallel
+ * index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow parallel operations in a standalone backend or when
+	 * parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
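/*
 * Worked example (assumed index mix, not part of the patch) of the degree
 * computation above: say 4 indexes support parallel bulk-deletion, 5 support
 * some form of parallel cleanup, the user asked for 2 workers, and
 * max_parallel_maintenance_workers is 8.
 */
static int
parallel_degree_example(void)
{
	int			bulkdel = 4, cleanup = 5, nrequested = 2, max_workers = 8;
	int			nindexes_parallel = (bulkdel > cleanup) ? bulkdel : cleanup;
	int			workers;

	nindexes_parallel--;		/* the leader takes one index */
	workers = (nrequested > 0 && nrequested < nindexes_parallel)
		? nrequested : nindexes_parallel;
	return (workers < max_workers) ? workers : max_workers; /* yields 2 */
}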
+
+/*
+ * Initialize the shared index-statistics area by setting the NULL bitmap
+ * that records which indexes have a statistics slot.  Since we don't
+ * currently support parallel vacuum for autovacuum, we don't need to care
+ * about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
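/*
 * A standalone sketch (not part of the patch) of the NULL-bitmap arithmetic
 * above: index i lives in byte i >> 3, at bit i & 0x07, so the test mirrors
 * the set operation.
 */
#include <stdint.h>

static int
bitmap_test_sketch(const uint8_t *bitmap, int i)
{
	return (bitmap[i >> 3] & (1 << (i & 0x07))) != 0;
}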
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * Keep the offset MAXALIGNed, since that is how the shared memory size
+	 * was estimated above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM segment into local memory and use them to
+ * update pg_class later.  One might think that we could exit parallel mode,
+ * update the index statistics, and then destroy the parallel context, but
+ * that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
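/*
 * A standalone sketch (hypothetical types, not part of the patch) of the
 * packed-slot walk above: statistics structs are laid out back to back with
 * no space reserved for NULL entries, so reaching entry n means skipping
 * only the non-NULL entries before it.
 */
#include <stdbool.h>
#include <stddef.h>

static const int *
nth_packed_sketch(const int *packed, const bool *is_null, int n)
{
	if (is_null[n])
		return NULL;
	for (int i = 0; i < n; i++)
	{
		if (!is_null[i])
			packed++;			/* each non-NULL entry occupies one slot */
	}
	return packed;
}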
+
+/*
+ * Return true if the given index should be skipped in the current phase,
+ * i.e., it does not participate in parallel index vacuum or parallel index
+ * cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
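/*
 * A standalone sketch (hypothetical bit values, not part of the patch) of
 * the skip decisions above: an index that sets only the conditional-cleanup
 * bit is processed in parallel on the first cleanup pass, then skipped on
 * later passes.
 */
#include <stdbool.h>

#define OPT_BULKDEL			(1 << 0)	/* hypothetical flag values */
#define OPT_COND_CLEANUP	(1 << 1)
#define OPT_CLEANUP			(1 << 2)

static bool
skip_sketch(unsigned opts, bool for_cleanup, bool first_time)
{
	if (for_cleanup)
	{
		if ((opts & (OPT_CLEANUP | OPT_COND_CLEANUP)) == 0)
			return true;		/* no parallel cleanup support at all */
		if (!first_time && (opts & OPT_COND_CLEANUP))
			return true;		/* already processed; skip conditional cleanup */
		return false;
	}
	return (opts & OPT_BULKDEL) == 0;	/* bulk-deletion pass */
}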
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the one used by the
+	 * leader process.  That is okay because this lock mode does not conflict
+	 * among the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes. indrels are sorted by OID, which should match the
+	 * order in which the leader opened them.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254954..df06e7d174 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that a different
+ * number of workers can be launched.  This is required for cases where we
+ * need to reuse the same DSM segment, but the number of workers can vary
+ * from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must not exceed the number of workers
+	 * with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
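/*
 * Hedged usage sketch (not part of the patch) showing how vacuumlazy.c
 * drives the new entry point on a second or later index-vacuum cycle: the
 * DSM segment is reused while the worker count varies per cycle.
 */
static void
relaunch_sketch(ParallelContext *pcxt, int nworkers)
{
	/* nworkers may change from cycle to cycle, but never exceeds pcxt->nworkers */
	ReinitializeParallelDSM(pcxt);
	ReinitializeParallelWorkers(pcxt, nworkers);
	LaunchParallelWorkers(pcxt);
	WaitForParallelWorkersToFinish(pcxt);
}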
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e252e4..6526cc1301 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,41 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The degree will be determined
+				 * at the start of lazy vacuum based on the number of indexes.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +201,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +437,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1738,6 +1793,20 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 		return false;
 	}
 
+	/*
+	 * Since parallel workers cannot access data in temporary tables, parallel
+	 * vacuum is not allowed for temporary relations.  However, rather than
+	 * skipping vacuum on the table entirely, just disabling the parallel
+	 * option is the better choice in most cases.
+	 */
+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}
+
 	/*
 	 * Silently ignore partitioned tables as there is no work to be done.  The
 	 * useful work is on their child partitions, which have been queued up for
@@ -1941,16 +2010,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2045,65 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance).
+ * Each worker updates the shared balance as and when it incurs any cost,
+ * and then decides on that basis whether it needs to sleep.  A worker
+ * sleeps in proportion to the work it has done, and VacuumSharedCostBalance
+ * is reduced by the amount consumed by that worker (VacuumCostBalanceLocal).
+ * This avoids putting workers to sleep that have done little or no I/O
+ * compared to other workers, and thereby ensures that workers doing more
+ * I/O get throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance exceeds
+ * VacuumCostLimit.  Testing revealed that we achieve the required
+ * throttling if we allow a worker that has done more than 50% of its share
+ * of the work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
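/*
 * Worked numeric sketch (assumed settings, not part of the patch) of the
 * rule above, with vacuum_cost_limit = 200 and vacuum_cost_delay = 2ms: a
 * worker naps only once the shared balance reaches the limit and its own
 * contribution exceeds half of its fair share (0.5 * 200 / nworkers).
 */
static double
parallel_delay_example(int local_balance, int shared_balance, int nworkers)
{
	const int	cost_limit = 200;	/* assumed vacuum_cost_limit */
	const double cost_delay = 2.0;	/* assumed vacuum_cost_delay, in ms */

	if (shared_balance >= cost_limit &&
		local_balance > 0.5 * (cost_limit / nworkers))
		return cost_delay * local_balance / cost_limit;
	return 0.0;					/* keep accumulating; no nap yet */
}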
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..6d1f28c327 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd88866c9..99451fd942 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4caef7..479f17c55f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708ba5f..fc6a5603bb 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad406..56417f0a8b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means
+	 * choose based on the number of indexes; -1 disables parallel vacuum.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..22cca70687 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..d6859a5bc9 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86f92..0242e6627d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
+LVParallelState
 LVRelStats
+LVShared
+LVSharedIndStats
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
2.16.2.windows.1

#341Sergei Kornilov
sk@zsrv.org
In reply to: Amit Kapila (#340)

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+	{
+		ereport(WARNING,
+				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+						RelationGetRelationName(onerel))));
+		params->nworkers = -1;
+	}

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?
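
For illustration, a minimal sketch of where this happens (simplified from the relation loop in commands/vacuum.c; variable names as in the patch, the comment is mine):

foreach(cur, relations)
{
	VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);

	/*
	 * The check quoted above (reached from vacuum_rel()) sets
	 * params->nworkers = -1, and params is shared across the whole
	 * loop, so every relation vacuumed after the temp table runs
	 * without parallelism.
	 */
	vacuum_rel(vrel->oid, vrel->relation, params);
}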

regards, Sergei

#342Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Sergei Kornilov (#341)
1 attachment(s)

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this. Attaching a delta
patch that fixes both comments.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v44-0002-delta_Allow-vacuum-command-to-process-indexes-in-parallel.patchapplication/octet-stream; name=v44-0002-delta_Allow-vacuum-command-to-process-indexes-in-parallel.patchDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f12c41f..d4ffd1b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2158,11 +2158,12 @@ vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
 
 	for (i = 0; i < nindexes; i++)
 	{
-		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+		bool		can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+									skip_parallel_vacuum_index(Irel[i],
+															   lps->lvshared));
 
 		/* Skip the indexes that can be processed by parallel workers */
-		if (!skip_index)
+		if (!can_parallel)
 			continue;
 
 		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 6526cc1..a32fe28 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -430,6 +430,7 @@ vacuum(List *relations, VacuumParams *params,
 	PG_TRY();
 	{
 		ListCell   *cur;
+		int			nworkers = params->nworkers;
 
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
@@ -446,6 +447,14 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * Restore the number of workers.  It is possible that nworkers
+			 * was reset to -1 to disable parallel vacuum for a previous
+			 * temp table.
+			 */
+			if (nworkers != params->nworkers)
+				params->nworkers = nworkers;
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params))
#343Sergei Kornilov
sk@zsrv.org
In reply to: Mahendra Singh Thalor (#342)

Hello

Yes, we should improve this. I tried to fix this. Attaching a delta
patch that fixes both comments.

Thank you, I have no objections.

I think the status of the CF entry is outdated and the most appropriate status for this patch is "Ready for Committer". Changed. I also added an annotation with a link to the recently summarized results.

regards, Sergei

#344Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh Thalor (#342)

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com>
wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)

+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;

Does the variable name skip_index not confuse here? Maybe rename to
something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+		bool		can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+									skip_parallel_vacuum_index(Irel[i],
+															   lps->lvshared));

The above condition is true when the index can *not* be processed by
parallel index vacuum. How about changing it to skipped_index and changing
the comment to something like “We are interested only in the indexes
skipped by parallel vacuum”?
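
As a concrete sketch, the suggestion would look roughly like this in vacuum_indexes_leader (illustrative only, not taken from any posted patch):

		bool		skipped_index = (get_indstats(lps->lvshared, i) == NULL ||
									 skip_parallel_vacuum_index(Irel[i],
																lps->lvshared));

		/* We are interested only in the indexes skipped by parallel vacuum */
		if (!skipped_index)
			continue;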

Another question about behavior on temporary tables. Use case: the user
commands just "vacuum;" to vacuum entire database (and has enough
maintenance workers). Vacuum starts fine in parallel, but on first
temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining
tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#345Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#344)
3 attachment(s)

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* be processed by parallel index vacuum. How about changing it to skipped_index and changing the comment to something like “We are interested only in the indexes skipped by parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what the point of confusion in
the current code is.

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here.  I think we don't need to
change the value of params->nworkers in the first place if we allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display the warning unless the user has explicitly asked for the
parallel option. See the fix in the attached patch.
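
A rough sketch of that idea (illustrative only; the attached patch is authoritative): compute the worker count per relation instead of clobbering params->nworkers, and warn only when the user explicitly requested a parallel degree:

	int			nworkers = params->nworkers;

	if (RelationUsesLocalBuffers(onerel) && nworkers >= 0)
	{
		/* warn only if the user explicitly asked for parallel vacuum */
		if (nworkers > 0)
			ereport(WARNING,
					(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
							RelationGetRelationName(onerel))));
		nworkers = -1;			/* serial vacuum for this relation only */
	}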

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patchapplication/octet-stream; name=v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patchDownload
From 74832b063b3c6b69e71d4e5372d2e8f50be8453c Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH 1/3] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, if vacuum is canceled or errors out between the first and
second stages) delay the recycling of the pages.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share information about internal and empty pages
between different stages of vacuum.  It would be quite tricky to share
this information via DSM, which is required for the upcoming parallel
vacuum patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69296..fffdfff6e1 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index def74fdaa3..a9c616c772 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages.  They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,24 +83,12 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
-	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
 	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
2.16.2.windows.1

v45-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchapplication/octet-stream; name=v45-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchDownload
From 7effa7dd7488fbed7447f807987996f14321433b Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 14:36:35 +0530
Subject: [PATCH 2/3] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b9f0..0104d02f67 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af7844d..2e8f67ef10 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bcb91..a7e55caf28 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad341b3..aefc302ed2 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc98f..4871b7ff4d 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e6b7..5254bc7ef5 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908764..4924ae1c59 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8d3e..3b3e22f73d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd0c1..b3351ad406 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAm's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06639..f32632089b 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.16.2.windows.1
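
To make the new IndexAmRoutine fields concrete, here is a minimal sketch of how a hypothetical index AM handler ("myidxhandler", not part of the patch) would advertise its parallel vacuum capabilities, mirroring the bthandler/gisthandler changes above:

#include "access/amapi.h"
#include "commands/vacuum.h"

Datum
myidxhandler(PG_FUNCTION_ARGS)
{
	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

	/* ... fill in the other am* fields as for any index AM ... */

	/* this AM's ambulkdelete does not use maintenance_work_mem */
	amroutine->amusemaintenanceworkmem = false;

	/*
	 * Participate in parallel bulk-deletion, and in parallel cleanup only
	 * when no bulkdelete has been performed in this vacuum.
	 */
	amroutine->amparallelvacuumoptions =
		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;

	PG_RETURN_POINTER(amroutine);
}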

v45-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchapplication/octet-stream; name=v45-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchDownload
From 007631f154eb86033ba7e9b04cedb6294452012b Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH 3/3] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command, which is limited by the number of indexes on a
table.  Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum only if its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   66 +-
 src/backend/access/heap/vacuumlazy.c  | 1266 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  134 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   11 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1455 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..beb3d599c9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4895,7 +4895,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..b8435da7fa 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal>
+   option and specify parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -223,6 +228,35 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option or parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes that support parallel vacuum operation on the
+      relation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  The index can
+      participate in a parallel vacuum if and only if the size of the index is
+      more than <xref linkend="guc-min-parallel-index-scan-size"/>.  Please
+      note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before each phase starts and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +271,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,11 +362,19 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps proportional to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe90485f..b1887e7d22 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table, and since updates are not allowed during
+ * parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuum, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum workers.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
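+/*
+ * Memory layout of the LVShared area in the DSM segment: the LVShared
+ * header (including the bitmap, BITMAPLEN(nindexes) bytes), padded to the
+ * MAXALIGN boundary recorded in 'offset', is followed by one
+ * LVSharedIndStats slot for each index whose bit is set in the bitmap.
+ * get_indstats() walks this packed array, skipping indexes whose bit is
+ * clear.
+ */
+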
+/*
+ * Struct for index bulk-deletion statistics used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion,
+	 * parallel index cleanup, and parallel index conditional cleanup,
+	 * respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool useindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuum and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup, and they exit once done with all indexes.  At the end
+ *		of this function we exit from parallel mode.  Index bulk-deletion
+ *		results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,46 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Try to initialize parallel vacuum if requested.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give a warning only if the user explicitly tries to perform a
+			 * parallel vacuum on a temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples if parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
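+	/*
+	 * From here on, dead_tuples points either into the DSM segment (parallel
+	 * vacuum) or into backend-local memory (serial vacuum); the code below
+	 * doesn't need to distinguish the two cases.
+	 */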
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +967,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +983,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +994,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1189,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1228,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1374,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1444,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1473,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1588,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1622,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1639,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1699,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1717,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1776,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1785,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1833,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1844,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1974,354 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context was initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to consider it.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
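+	/*
+	 * For instance, if the parallel context was created with two workers but
+	 * the current phase has four parallel-capable indexes, we still launch
+	 * only two workers; the remaining indexes are picked up by whichever
+	 * participants finish first.
+	 */
+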
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulk-deletion and
+		 * cleanup phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend, as the
+			 * remaining balance from the heap scan has already been
+			 * accumulated into the shared balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
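+	/*
+	 * The leader and all workers claim index numbers from the shared counter
+	 * lvshared->idx below, so the indexes are distributed dynamically:
+	 * whichever participant finishes its current index first claims the next
+	 * one.
+	 */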
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it, because
+	 * the index AM allocates it locally and a different vacuum process may
+	 * process the index next time.  Copying the result normally happens only
+	 * after the first cycle of index vacuuming.  From the second cycle
+	 * onward, we pass the result stored in the DSM segment so that the index
+	 * AM updates it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result to different
+	 * slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need the
+		 * locally allocated result.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2331,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2370,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2704,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2728,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2783,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2936,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The size
+ * of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember the indexes that participate in parallel
+ * index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
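+	/*
+	 * For example, if five indexes support parallel bulk-deletion and three
+	 * support parallel cleanup, nindexes_parallel is Max(5, 3) - 1 = 4; with
+	 * nrequested = 0 and max_parallel_maintenance_workers = 2 we end up
+	 * requesting two workers.
+	 */
+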
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, setting the NULL bitmap
+ * according to which indexes participate in parallel vacuum.  Since we don't
+ * currently support parallel vacuum for autovacuum, we don't need to care
+ * about autovacuum_work_mem.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
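+		/* (i >> 3) selects the byte and (i & 0x07) the bit within it */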
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch even one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
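+
+	/*
+	 * Divide maintenance_work_mem among the workers that may actually use
+	 * it.  For example, with maintenance_work_mem = 256MB, four planned
+	 * workers, and two indexes whose AM uses maintenance_work_mem, each such
+	 * index vacuum gets 256MB / 2 = 128MB.
+	 */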
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * Keep this offset MAXALIGNed, because the shared memory size was
+	 * estimated that way above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM segment into local memory and later use
+ * them to update the index statistics.  One might think that we could exit
+ * from parallel mode, update the index statistics, and then destroy the
+ * parallel context, but that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Return true if the given index should be skipped, i.e., it does not
+ * participate in the current phase of parallel index vacuum or parallel
+ * index cleanup.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as that of the leader
+	 * process.  That's okay because the lock mode does not conflict among
+	 * the parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in order by OID, which should
+	 * match the leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254954..df06e7d174 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can launch
+ * a different number of workers.  This is required for cases where we need to
+ * reuse the same DSM segment, but the number of workers can vary from
+ * run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e252e4..d207bba44a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,41 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
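+
+			/*
+			 * For example, VACUUM (PARALLEL) tbl lets the parallel degree be
+			 * chosen based on the number of indexes, VACUUM (PARALLEL 4) tbl
+			 * requests four workers, and VACUUM (PARALLEL 0) tbl disables
+			 * parallelism.
+			 */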
+			if (opt->arg == NULL)
+			{
+				 * Parallel lazy vacuum is requested, but the user didn't
+				 * specify the parallel degree.  The parallel degree will be
+				 * determined at the start of lazy vacuum based on the number
+				 * of indexes.
+				 * indexes.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +201,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +437,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1996,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2031,65 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of cost-based vacuum delay for parallel index vacuuming is
+ * to allow all parallel vacuum workers, including the leader process, to
+ * have a shared view of cost-related parameters (mainly VacuumCostBalance).
+ * Each worker updates it as and when it incurs any cost, and based on that
+ * decides whether it needs to sleep.  A worker sleeps proportionally to the
+ * work it has done, and reduces VacuumSharedCostBalance by the amount
+ * consumed by the current worker (VacuumCostBalanceLocal).  This avoids
+ * putting to sleep workers that have done little or no I/O compared to other
+ * workers, and ensures that workers doing more I/O get throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance is more than the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..6d1f28c327 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd88866c9..99451fd942 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4caef7..479f17c55f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708ba5f..fc6a5603bb 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad406..56417f0a8b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,12 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means
+	 * choose based on the number of indexes.  -1 indicates parallel vacuum
+	 * is disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +237,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..22cca70687 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..d6859a5bc9 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86f92..0242e6627d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
+LVParallelState
 LVRelStats
+LVShared
+LVSharedIndStats
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
2.16.2.windows.1
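
To illustrate the throttling rule in compute_parallel_delay() above, here is a
minimal standalone C sketch of the same arithmetic.  The atomics are replaced
by plain variables, the 4x clamp on msec in vacuum_delay_point() is omitted,
and the cost numbers are invented for illustration only:

#include <stdio.h>

int
main(void)
{
	int		cost_limit = 200;			/* stands in for VacuumCostLimit */
	double	cost_delay = 20.0;			/* stands in for VacuumCostDelay (msec) */
	int		nworkers = 2;				/* active workers, including the leader */
	int		shared_balance = 0;			/* stands in for VacuumSharedCostBalance */
	int		local_balance = 0;			/* stands in for VacuumCostBalanceLocal */
	int		incurred[] = {80, 90, 60};	/* invented costs between delay points */

	for (int i = 0; i < 3; i++)
	{
		shared_balance += incurred[i];	/* pg_atomic_add_fetch_u32 in the patch */
		local_balance += incurred[i];

		/*
		 * Sleep only if the overall balance exceeds the limit and this
		 * worker has consumed more than 50% of its per-worker share.
		 */
		if (shared_balance >= cost_limit &&
			local_balance > 0.5 * (cost_limit / nworkers))
		{
			double	msec = cost_delay * local_balance / cost_limit;

			printf("delay point %d: sleep %.1f msec\n", i + 1, msec);
			shared_balance -= local_balance;	/* give back what we consumed */
			local_balance = 0;
		}
		else
			printf("delay point %d: no sleep\n", i + 1);
	}

	return 0;
}

With these numbers, the first two delay points incur no sleep, and the third
sleeps for 23.0 msec, since only then has the shared balance crossed the
limit while this worker holds more than half of its share.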

#346Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#345)

On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what is the point of confusion in
the current code?

Yeah the current code is also good. I just thought they were concerned
that the variable name skip_index might be confusing because we skip
if skip_index is NOT true.

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here. I think we don't need to
change the value of params->nworkers at first place if allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display warning unless the user has explicitly asked for parallel
option. See the fix in the attached patch.

Agreed. But with the updated patch the PARALLEL option without the
parallel degree doesn't display warning because params->nworkers = 0
in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

+                       /*
+                        * Give warning only if the user explicitly tries to perform a
+                        * parallel vacuum on the temporary table.
+                        */
+                       if (params->nworkers > 0)
+                               ereport(WARNING,
+                                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                                               RelationGetRelationName(onerel))));
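
To sketch what I mean, here is a rough standalone mock-up; the struct and
function below are stand-ins for the real VacuumParams and vacuum_rel(), not
actual patch code:

#include <stdio.h>
#include <stdbool.h>

typedef struct VacuumParamsSketch
{
	int			nworkers;		/* -1 disables parallel vacuum */
} VacuumParamsSketch;

static void
vacuum_rel_sketch(bool is_temp, VacuumParamsSketch *params)
{
	int			save_nworkers = params->nworkers;

	/* Temporary tables cannot be vacuumed in parallel */
	if (is_temp && params->nworkers >= 0)
	{
		if (params->nworkers > 0)
			printf("WARNING: disabling parallel option of vacuum\n");
		params->nworkers = -1;
	}

	printf("vacuuming with nworkers = %d\n", params->nworkers);

	/* Restore so that the next relation still sees the user's setting */
	params->nworkers = save_nworkers;
}

int
main(void)
{
	VacuumParamsSketch params = {2};	/* as if the user asked for PARALLEL 2 */

	vacuum_rel_sketch(true, &params);	/* temp table: warns, runs serially */
	vacuum_rel_sketch(false, &params);	/* next table is still parallel */

	return 0;
}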

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#347Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Masahiko Sawada (#346)

On Sat, 11 Jan 2020 at 19:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what is the point of confusion in
the current code?

Yeah the current code is also good. I just thought they were concerned
that the variable name skip_index might be confusing because we skip
if skip_index is NOT true.

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here. I think we don't need to
change the value of params->nworkers at first place if allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display warning unless the user has explicitly asked for parallel
option. See the fix in the attached patch.

Agreed. But with the updated patch the PARALLEL option without the
parallel degree doesn't display warning because params->nworkers = 0
in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

+                       /*
+                        * Give warning only if the user explicitly tries to perform a
+                        * parallel vacuum on the temporary table.
+                        */
+                       if (params->nworkers > 0)
+                               ereport(WARNING,
+                                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                                               RelationGetRelationName(onerel))));

Hi,
I have some doubts about the warning for temporary tables. Below are some
examples.

Suppose we have one temporary table named "temp_table".
*Case 1:*
vacuum;
I think, in this case, we should not give any warning for the temp table. We
should do a parallel vacuum (considering zero as the parallel degree) for all
tables except temporary tables.

*Case 2:*
vacuum (parallel);

*Case 3:*
vacuum(parallel 5);

*Case 4*:
vacuum(parallel) temp_table;

*Case 5:*
vacuum(parallel 4) temp_table;

I think, for cases 2 and 4, as per the new design, we should give an error
(ERROR: parallel degree should be specified between 0 and 1024) because
parallel vacuum is ON by default, so if the user gives the PARALLEL option
without a degree, then we can give an error.
If we give an error for cases 2 and 4, then we can give a warning for cases 3
and 5.

Thoughts?

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#348Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#346)

On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what is the point of confusion in
the current code?

Yeah the current code is also good. I just thought they were concerned
that the variable name skip_index might be confusing because we skip
if skip_index is NOT true.

Okay, would it better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
continue;
...

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here. I think we don't need to
change the value of params->nworkers at first place if allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display warning unless the user has explicitly asked for parallel
option. See the fix in the attached patch.

Agreed. But with the updated patch the PARALLEL option without the
parallel degree doesn't display warning because params->nworkers = 0
in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

I had also thought along those lines, but I was not entirely sure about
this resetting of workers. Thinking about it again today, the idea
Mahendra is suggesting, namely giving an error if the parallel degree
is not specified, seems reasonable to me. This means
Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
error "parallel degree must be specified". This idea has merit because
we now support a parallel vacuum by default, so a 'parallel' option
without a parallel degree doesn't have any meaning. If we do that,
then we also don't need to do anything additional about the handling of
temp tables (other than what the patch is already doing). What do
you think?
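
If we go that route, the check could be as simple as the following untested
sketch in ExecVacuum()'s PARALLEL handling, reusing the symbols from the
patch; the message text is illustrative, not final:

/* untested sketch; would replace the params.nworkers = 0 branch */
if (opt->arg == NULL)
	ereport(ERROR,
			(errcode(ERRCODE_SYNTAX_ERROR),
			 errmsg("parallel degree must be specified"),
			 parser_errposition(pstate, opt->location)));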

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#349Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#348)

On Mon, Jan 13, 2020 at 9:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what is the point of confusion in
the current code?

Yeah the current code is also good. I just thought they were concerned
that the variable name skip_index might be confusing because we skip
if skip_index is NOT true.

Okay, would it better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
continue;
...

Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here. I think we don't need to
change the value of params->nworkers at first place if allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display warning unless the user has explicitly asked for parallel
option. See the fix in the attached patch.

Agreed. But with the updated patch the PARALLEL option without the
parallel degree doesn't display warning because params->nworkers = 0
in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

I had also thought along those lines, but I was not entirely sure about
this resetting of workers. Thinking about it again today, the idea
Mahendra is suggesting, namely giving an error if the parallel degree
is not specified, seems reasonable to me. This means
Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
error "parallel degree must be specified". This idea has merit because
we now support a parallel vacuum by default, so a 'parallel' option
without a parallel degree doesn't have any meaning. If we do that,
then we also don't need to do anything additional about the handling of
temp tables (other than what the patch is already doing). What do
you think?

This idea makes sense to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#350Sergei Kornilov
sk@zsrv.org
In reply to: Masahiko Sawada (#346)

Hello

I just thought they were concerned
that the variable name skip_index might be confusing because we skip
if skip_index is NOT true.

Right.

- bool skip_index = (get_indstats(lps->lvshared, i) == NULL ||
- skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+ bool can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+ skip_parallel_vacuum_index(Irel[i],
+ lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum.

Ouch, right. I was wrong. (or the variable name and the comment really confused me)

Okay, would it better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
continue;

Complex condition... Not sure.

How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?

I prefer this idea.
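
To be concrete, the rename being discussed would read like this (untested
sketch):

/* We are interested only in the indexes skipped by parallel workers */
bool	skipped_index = (get_indstats(lps->lvshared, i) == NULL ||
						 skip_parallel_vacuum_index(Irel[i], lps->lvshared));

if (!skipped_index)
	continue;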

Thinking about it again today, the idea Mahendra is suggesting,
namely giving an error if the parallel degree is not specified,
seems reasonable to me.

+1

regards, Sergei

#351Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#336)
2 attachment(s)

On Thu, Jan 9, 2020 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 9, 2020 at 10:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 8 Jan 2020 at 22:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

What do you think of the attached? Sawada-san, kindly verify the
changes and let me know your opinion.

I agreed to not include both the FAST option patch and
DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
on the main part and we can discuss and add them later if want.

I've looked at the latest version patch you shared. Overall it looks
good and works fine. I have a few small comments:

I have addressed all your comments and slightly changed nearby comments
and ran pgindent. I think we can commit the first two preparatory
patches now unless you or someone else has any more comments on those.

I have pushed the first one (4e514c6) and I am planning to commit the
next one (the API patch, v46-0001-Introduce-IndexAM-fields-for-parallel-vacuum)
on Wednesday. We are still discussing a few things for the main
parallel vacuum patch
(v46-0002-Allow-vacuum-command-to-process-indexes-in-parallel), on which
we should reach a conclusion soon. In the attached, I have made a few
changes in the comments of patch
v46-0002-Allow-vacuum-command-to-process-indexes-in-parallel.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v46-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchapplication/octet-stream; name=v46-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchDownload
From d9ddbc1ee9c132c094b3f057dd0aa7af2e955e44 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 14:36:35 +0530
Subject: [PATCH 1/2] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
whether a particular IndexAM uses maintenance_work_mem or not.  This will
help in controlling the memory used by individual workers as otherwise,
each worker can consume memory equal to maintenance_work_mem.  The
amparallelvacuumoptions tells whether a particular IndexAM participates in
a parallel vacuum and, if so, in which phases (bulkdelete, vacuumcleanup) of
vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b..0104d02 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68..37f8d87 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af78..2e8f67e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bc..a7e55ca 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad34..aefc302 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc..4871b7f 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e..5254bc7 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908..4924ae1 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8..3b3e22f 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd..b3351ad 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAMs that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAMs that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAMs that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAMs that scan the index
+ * during the cleanup phase irrespective of whether the index was already
+ * scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06..f326320 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
1.8.3.1

v46-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchapplication/octet-stream; name=v46-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchDownload
From ac4953873ab88d35842d95bd2c9582b555eb0dc9 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH 2/2] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command, which is limited by the number of indexes on a
table.  Specifying zero as the number of workers will disable parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   66 +-
 src/backend/access/heap/vacuumlazy.c  | 1266 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  135 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   12 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1457 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c902..beb3d59 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4895,7 +4895,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..b8435da 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL [ <replaceable class="parameter">integer</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal> option
+   and specify the number of parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -224,6 +229,35 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option or the parallel degree
+      <replaceable class="parameter">integer</replaceable> is omitted,
+      then <command>VACUUM</command> decides the number of workers based
+      on the number of indexes on the relation that support parallel vacuum
+      operation, which is further limited by
+      <xref linkend="guc-max-parallel-workers-maintenance"/>.  An index can
+      participate in a parallel vacuum if and only if the size of the index is
+      more than <xref linkend="guc-min-parallel-index-scan-size"/>.  Please
+      note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +272,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+      The <replaceable class="parameter">integer</replaceable> value can
+      also be omitted, in which case the value is decided by the command
+      based on the option used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,10 +363,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps proportional to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe904..b95caf4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  To update the index statistics, we need
+ * to update the system table, and since updates are not allowed during
+ * parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuum, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuum and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at beginning of index vacuuming and index
+ *		cleanup and they exit once done with all indexes.  At the end of this
+ *		function we exit from parallel mode.  Index bulk-deletion results are
+ *		stored in the DSM segment and we update index statistics for all the
+ *		indexes after exiting from parallel mode since writes are not allowed
+ *		during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,46 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Initialize the state for parallel vacuum.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give a warning only if the user explicitly tried to perform a
+			 * parallel vacuum on a temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples if parallel vacuum is not active;
+	 * otherwise it has already been allocated in the DSM segment.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +967,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +983,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +994,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1189,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1228,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1374,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1444,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1473,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1588,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1622,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1639,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1699,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1717,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1776,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1785,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1833,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1844,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1974,354 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context was initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to account for that.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
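+
+	/*
+	 * Worked example (illustrative): with 4 indexes that all support
+	 * parallel bulk-deletion and a parallel context initialized with 2
+	 * workers, nworkers is 4 - 1 = 3 here, and the Min() above caps it to 2.
+	 */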
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend as we have
+			 * already transferred the remaining balance of the heap scan to
+			 * the shared balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up indexes that can be processed only by the leader process
+ * because those indexes don't support parallel operation in the current phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
+								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+		/* Skip the indexes that can be processed by parallel workers */
+		if (!skip_index)
+			continue;
+
+		vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+						 get_indstats(lps->lvshared, i),
+						 vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up one index, either in the leader process or in one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we receive it,
+	 * because these callbacks allocate the result locally and a different
+	 * process might vacuum this index next time.  The copy therefore
+	 * normally happens only after the first index vacuuming; from then on,
+	 * we pass the result stored in the DSM segment so that the callbacks
+	 * update it in place.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result into
+	 * different slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2331,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples, and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2370,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2704,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2728,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
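+
+/*
+ * Illustrative arithmetic: with maintenance_work_mem = 64MB and indexes
+ * present, maxtuples = 65536 * 1024 / sizeof(ItemPointerData) =
+ * 65536 * 1024 / 6, i.e. roughly 11 million dead-tuple TIDs, subject to the
+ * caps applied above (INT_MAX, the relation-size-based limit, and the
+ * MaxHeapTuplesPerPage floor).
+ */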
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2783,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2936,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The size
+ * of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember which indexes participate in parallel
+ * index vacuuming.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
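+
+/*
+ * Worked example (illustrative): for a table with 3 indexes where 2 support
+ * parallel bulk-deletion and all 3 support some form of parallel cleanup,
+ * nindexes_parallel = Max(2, 3) - 1 = 2.  With nrequested = 0 and
+ * max_parallel_maintenance_workers = 8, we request 2 workers.
+ */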
+
+/*
+ * Initialize the shared index statistics area: set the NULL bitmap that
+ * records which indexes have statistics slots in shared memory.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	/* Currently, we don't support parallel vacuum for autovacuum */
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option must be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
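+
+	/*
+	 * Illustrative arithmetic: with maintenance_work_mem = 1GB,
+	 * parallel_workers = 4 and two indexes that use maintenance_work_mem
+	 * (nindexes_mwm = 2), each process gets 1GB / Min(4, 2) = 512MB.
+	 */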
+
+	/*
+	 * Take alignment into account here, since the shared memory size was
+	 * estimated with MAXALIGN above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and use them later to
+ * update the index statistics.  One might think that we could exit from
+ * parallel mode, update the index statistics, and then destroy the parallel
+ * context, but that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
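+
+/*
+ * Sketch of the walk above (illustrative): with three indexes where only
+ * index 1 has a NULL slot, get_indstats(lvshared, 2) starts at the first
+ * stats slot, advances once past index 0's slot, skips index 1 without
+ * advancing, and returns the second LVSharedIndStats in the array.
+ */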
+
+/*
+ * Returns true if the given index can't participate in parallel index
+ * vacuuming or parallel index cleanup; returns false otherwise.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
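+
+/*
+ * Illustrative behavior: an index whose amparallelvacuumoptions is only
+ * VACUUM_OPTION_PARALLEL_COND_CLEANUP is skipped during bulk-deletion,
+ * processed in parallel at cleanup if no bulk-deletion has happened yet
+ * (first_time), and skipped at cleanup otherwise.
+ */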
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because this lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254..df06e7d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context such that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must not exceed the number of workers
+	 * with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
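+/*
+ * Usage sketch (illustrative), roughly as in parallel lazy vacuum: reuse the
+ * same DSM segment across index-vacuum cycles while varying the worker count:
+ *
+ *   ReinitializeParallelDSM(pcxt);
+ *   ReinitializeParallelWorkers(pcxt, nworkers);   (nworkers <= pcxt->nworkers)
+ *   LaunchParallelWorkers(pcxt);
+ *   ...
+ *   WaitForParallelWorkersToFinish(pcxt);
+ */
+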
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e25..74784e8 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,41 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				/*
+				 * Parallel lazy vacuum is requested but the user didn't
+				 * specify the parallel degree.  The degree will be
+				 * determined at the start of lazy vacuum based on the
+				 * number of indexes.
+				 */
+				params.nworkers = 0;
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +201,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
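+	/*
+	 * Illustrative invocations (assuming a table "tbl"):
+	 *   VACUUM (PARALLEL) tbl;         -- degree chosen from the indexes
+	 *   VACUUM (PARALLEL 4) tbl;       -- request up to 4 workers
+	 *   VACUUM (PARALLEL 0) tbl;       -- disable parallel vacuum
+	 *   VACUUM (FULL, PARALLEL 2) tbl; -- rejected by the check above
+	 */
+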
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +437,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1996,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2032,66 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow each worker to sleep in proportion to the work it has done.
+ * We achieve this by giving all parallel vacuum workers, including the
+ * leader process, a shared view of the cost-related parameters (mainly
+ * VacuumCostBalance).  Each worker updates it whenever it has incurred any
+ * cost and then, based on that, decides whether it needs to sleep.  We
+ * compute the time a worker should sleep based on the cost it has incurred
+ * (VacuumCostBalanceLocal) and then reduce VacuumSharedCostBalance by that
+ * amount.  This avoids putting to sleep workers that have done little or no
+ * I/O compared to other workers, and ensures that workers doing more I/O
+ * are throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance exceeds the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
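+
+/*
+ * Worked example (illustrative): with VacuumCostLimit = 200,
+ * VacuumCostDelay = 20ms and 2 active workers, a worker sleeps only once the
+ * shared balance has reached 200 and its own local balance exceeds
+ * 0.5 * (200 / 2) = 50.  With a local balance of 120 it sleeps for
+ * 20 * 120 / 200 = 12ms and subtracts 120 from the shared balance.
+ */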
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e3..6d1f28c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd8886..99451fd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4ca..479f17c 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708b..fc6a560 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad..62dd01f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,13 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means the
+	 * number is chosen based on the number of indexes.  -1 indicates that
+	 * parallel vacuum is disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +238,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..22cca70 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..d6859a5 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86..0242e66 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
+LVParallelState
 LVRelStats
+LVShared
+LVSharedIndStats
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
1.8.3.1
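
To illustrate the shared-cost-balance scheme described in compute_parallel_delay() above, here is a minimal single-threaded mock in plain C (an illustration only: the patch's atomics become plain integers, the msec cap applied in vacuum_delay_point() is omitted, and the parameter values are invented):

#include <stdio.h>

/*
 * Single-threaded mock of compute_parallel_delay(): only the
 * arithmetic is shown.
 */
static unsigned shared_balance = 0;     /* VacuumSharedCostBalance */
static int      nworkers = 2;           /* VacuumActiveNWorkers */
static int      cost_limit = 200;       /* VacuumCostLimit */
static double   cost_delay = 20.0;      /* VacuumCostDelay, in msec */

static double
compute_delay(int *local_balance, int cost_incurred)
{
    double      msec = 0;

    shared_balance += cost_incurred;    /* pg_atomic_add_fetch_u32 in the patch */
    *local_balance += cost_incurred;

    /* Sleep only above the shared limit and our ~50% share of it */
    if (shared_balance >= cost_limit &&
        *local_balance > 0.5 * (cost_limit / nworkers))
    {
        msec = cost_delay * (*local_balance) / cost_limit;
        shared_balance -= *local_balance;   /* pg_atomic_sub_fetch_u32 */
        *local_balance = 0;
    }
    return msec;
}

int
main(void)
{
    int         w1 = 0,
                w2 = 0;

    printf("w1: %.1f ms\n", compute_delay(&w1, 30));    /* 0.0: below limit */
    printf("w2: %.1f ms\n", compute_delay(&w2, 180));   /* 18.0: throttled */
    printf("w1: %.1f ms\n", compute_delay(&w1, 10));    /* 0.0: small share */
    return 0;
}

With a cost limit of 200 and two workers, only the worker that accumulated 180 of the 210 total balance sleeps, for 20 * 180 / 200 = 18 ms, matching the rule that a worker sleeps once it has done more than 50% of its share of the work.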

#352Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Sergei Kornilov (#341)

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?

I looked into the code again and thought that if we add a boolean
flag (can_parallel) to the IndexBulkDeleteResult structure to identify
whether an index supports parallel vacuum or not, then it will be easy
to skip those indexes, and we will not call skip_parallel_vacuum_index
multiple times (from vacuum_indexes_leader and parallel_vacuum_index).
We can also keep a linked list of the indexes that don't support
parallel vacuum and pass it directly to vacuum_indexes_leader.

Ex: suppose we have 5 indexes on a table, and before launching
parallel workers we set the boolean flag (can_parallel) in the
IndexBulkDeleteResult structure to record whether each index supports
parallel vacuum or not.
Say indexes 1 and 4 do not support parallel vacuum; then we already
have the info in a linked list (1->4) that they don't support parallel
vacuum, so vacuum_indexes_leader will process those indexes and the
rest will be processed by parallel workers. If a parallel worker finds
that can_parallel is false, it will skip that index.

As per my understanding, if we implement this, we can avoid calling
skip_parallel_vacuum_index multiple times, and if there is no index
that can't perform parallel vacuum, then we will not call
vacuum_indexes_leader at all, as the head of the list will point to
null (we can save an unnecessary call to vacuum_indexes_leader).
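
A rough standalone sketch of this idea (the names can_parallel and next_leader, and the structure around them, are illustrative only, not taken from the patch):

#include <stdbool.h>
#include <stdio.h>

typedef struct IdxStats
{
    int         idx_no;             /* index position in Irel[] */
    bool        can_parallel;       /* supports parallel vacuum? */
    struct IdxStats *next_leader;   /* chain of leader-only indexes */
} IdxStats;

int
main(void)
{
    IdxStats    stats[5] = {
        {1, false, NULL}, {2, true, NULL}, {3, true, NULL},
        {4, false, NULL}, {5, true, NULL}
    };
    IdxStats   *leader_head = NULL;
    IdxStats   *tail = NULL;

    /* Build the leader-only list once, before launching workers */
    for (int i = 0; i < 5; i++)
    {
        if (stats[i].can_parallel)
            continue;
        if (leader_head == NULL)
            leader_head = tail = &stats[i];
        else
        {
            tail->next_leader = &stats[i];
            tail = &stats[i];
        }
    }

    /* The leader walks only its own list (prints indexes 1 and 4);
     * workers simply skip entries whose can_parallel is false. */
    for (IdxStats *s = leader_head; s != NULL; s = s->next_leader)
        printf("leader vacuums index %d\n", s->idx_no);
    return 0;
}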

Thoughts?

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#353Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#348)

On Mon, 13 Jan 2020 at 12:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?

I also agree with your point.

I don't think the change is a good idea.

-               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
-                                                                 skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
+                                                                       skip_parallel_vacuum_index(Irel[i],
+                                                                                                                          lps->lvshared));

The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and changing the comment to something like “We are only interested in the indexes skipped by parallel vacuum”?

Hmm, I find the current code and comment better than what you or
Sergei are proposing. I am not sure what the point of confusion in the
current code is.

Yeah, the current code is also good. I just thought they were
concerned that the variable name skip_index might be confusing because
we skip when skip_index is NOT true.

Okay, would it be better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
continue;

Makes sense to me.

...

Another question about the behavior on temporary tables. Use case: the user runs just "vacuum;" to vacuum the entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on the first temporary table we hit:

+       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+       {
+               ereport(WARNING,
+                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+                                               RelationGetRelationName(onerel))));
+               params->nworkers = -1;
+       }

And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?

Good point.
Yes, we should improve this. I tried to fix this.

+1

Yeah, we can improve the situation here. I think we don't need to
change the value of params->nworkers in the first place if we allow
lazy_scan_heap to take care of this. Also, I think we shouldn't
display a warning unless the user has explicitly asked for the
parallel option. See the fix in the attached patch.

Agreed. But with the updated patch the PARALLEL option without a
parallel degree doesn't display the warning because params->nworkers =
0 in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

I had also thought along those lines, but I was not entirely sure
about this resetting of workers. Today, thinking about it again, the
idea Mahendra is suggesting, namely giving an error if the parallel
degree is not specified, seems reasonable to me. This means
Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
error "parallel degree must be specified". This idea has merit: now
that we support parallel vacuum by default, a 'parallel' option
without a parallel degree doesn't have any meaning. If we do that,
then we don't need to do anything additional about the handling of
temp tables (other than what the patch is already doing) either. What
do you think?

Good point! Agreed.
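
For illustration, the rule we are agreeing on could look like the mock below, written outside the server (the function name and message wording are illustrative; the real check would live in ExecVacuum's option parsing):

#include <stdio.h>
#include <stdlib.h>

#define MAX_PARALLEL_DEGREE 1024    /* mirrors the patch's upper bound */

/* Mock of the proposed rule: PARALLEL with no degree is an error,
 * instead of silently meaning "pick a default". */
static int
parse_parallel_degree(const char *arg)
{
    if (arg == NULL)
    {
        fprintf(stderr, "ERROR:  parallel degree must be specified\n");
        exit(1);
    }

    int degree = atoi(arg);

    if (degree < 0 || degree > MAX_PARALLEL_DEGREE)
    {
        fprintf(stderr,
                "ERROR:  parallel vacuum degree must be between 0 and %d\n",
                MAX_PARALLEL_DEGREE);
        exit(1);
    }
    return degree;
}

int
main(void)
{
    printf("VACUUM (PARALLEL 2): degree = %d\n", parse_parallel_degree("2"));
    parse_parallel_degree(NULL);    /* VACUUM (PARALLEL): errors out */
    return 0;
}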

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#354Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh Thalor (#352)

On Tue, 14 Jan 2020 at 03:20, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?

I looked into the code again and thought that if we add a boolean
flag (can_parallel) to the IndexBulkDeleteResult structure to identify
whether an index supports parallel vacuum or not, then it will be easy
to skip those indexes, and we will not call skip_parallel_vacuum_index
multiple times (from vacuum_indexes_leader and parallel_vacuum_index).
We can also keep a linked list of the indexes that don't support
parallel vacuum and pass it directly to vacuum_indexes_leader.

Ex: suppose we have 5 indexes on a table, and before launching
parallel workers we set the boolean flag (can_parallel) in the
IndexBulkDeleteResult structure to record whether each index supports
parallel vacuum or not.
Say indexes 1 and 4 do not support parallel vacuum; then we already
have the info in a linked list (1->4) that they don't support parallel
vacuum, so vacuum_indexes_leader will process those indexes and the
rest will be processed by parallel workers. If a parallel worker finds
that can_parallel is false, it will skip that index.

As per my understanding, if we implement this, we can avoid calling
skip_parallel_vacuum_index multiple times, and if there is no index
that can't perform parallel vacuum, then we will not call
vacuum_indexes_leader at all, as the head of the list will point to
null (we can save an unnecessary call to vacuum_indexes_leader).

Thoughts?

We skip not only the indexes that don't support parallel index vacuum
but also, depending on the vacuum phase, indexes that do support it.
That is, we could skip different indexes at different vacuum phases.
Therefore, with your idea, we would need at least three linked lists,
one for each possible vacuum phase (bulkdelete, conditional cleanup
and cleanup); is that right?

I think we can check whether there are indexes that should be
processed by the leader process before entering the loop in
vacuum_indexes_leader, by comparing the nindexes_parallel_XXX counters
of LVParallelState to the number of indexes, but I'm not sure it's
effective since the number of indexes on a table is usually small.
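
For instance, a standalone sketch of that pre-loop check (the counter names follow LVParallelState in the patch; counting conditional cleanup as parallel is a simplification that only holds for the first cleanup pass):

#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the patch's LVParallelState counters */
typedef struct
{
    int nindexes_parallel_bulkdel;
    int nindexes_parallel_cleanup;
    int nindexes_parallel_condcleanup;
} ParallelCounters;

/*
 * True if at least one index must be handled by the leader in the
 * given phase: any index not counted as parallel falls to the leader.
 */
static bool
leader_has_work(const ParallelCounters *c, int nindexes, bool for_cleanup)
{
    int nparallel = for_cleanup
        ? c->nindexes_parallel_cleanup + c->nindexes_parallel_condcleanup
        : c->nindexes_parallel_bulkdel;

    return nparallel < nindexes;
}

int
main(void)
{
    ParallelCounters c = {3, 1, 2};

    assert(!leader_has_work(&c, 3, false)); /* all 3 bulkdeletes parallel */
    assert(!leader_has_work(&c, 3, true));
    assert(leader_has_work(&c, 4, false));  /* one index left for leader */
    return 0;
}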

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#355Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Masahiko Sawada (#354)

On Tue, 14 Jan 2020 at 10:06, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 14 Jan 2020 at 03:20, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?

I looked into the code again and thought that if we add a boolean
flag (can_parallel) to the IndexBulkDeleteResult structure to identify
whether an index supports parallel vacuum or not, then it will be easy
to skip those indexes, and we will not call skip_parallel_vacuum_index
multiple times (from vacuum_indexes_leader and parallel_vacuum_index).
We can also keep a linked list of the indexes that don't support
parallel vacuum and pass it directly to vacuum_indexes_leader.

Ex: suppose we have 5 indexes on a table, and before launching
parallel workers we set the boolean flag (can_parallel) in the
IndexBulkDeleteResult structure to record whether each index supports
parallel vacuum or not.
Say indexes 1 and 4 do not support parallel vacuum; then we already
have the info in a linked list (1->4) that they don't support parallel
vacuum, so vacuum_indexes_leader will process those indexes and the
rest will be processed by parallel workers. If a parallel worker finds
that can_parallel is false, it will skip that index.

As per my understanding, if we implement this, we can avoid calling
skip_parallel_vacuum_index multiple times, and if there is no index
that can't perform parallel vacuum, then we will not call
vacuum_indexes_leader at all, as the head of the list will point to
null (we can save an unnecessary call to vacuum_indexes_leader).

Thoughts?

We skip not only indexes that don't support parallel index vacuum but
also indexes supporting it depending on vacuum phase. That is, we
could skip different indexes at different vacuum phase. Therefore with
your idea, we would need to have at least three linked lists for each
possible vacuum phase(bulkdelete, conditional cleanup and cleanup), is
that right?

I think we can check if there are indexes that should be processed by
the leader process before entering the loop in vacuum_indexes_leader
by comparing nindexes_parallel_XXX of LVParallelState to the number of
indexes but I'm not sure it's effective since the number of indexes on
a table should be small.

Hi,

+    /*
+     * Try to initialize the parallel vacuum if requested
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel))
+        {
+            /*
+             * Give warning only if the user explicitly tries to perform a
+             * parallel vacuum on the temporary table.
+             */
+            if (params->nworkers > 0)
+                ereport(WARNING,
+                        (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",

From the v45 patch, we moved the warning for temporary tables inside
the "params->nworkers >= 0 && vacrelstats->useindex" check, so if the
table doesn't have any indexes, we don't give any warning. I think we
should give the warning for all temporary tables if a parallel degree
is given. (Up to the v44 patch, we were giving the warning for all
temporary tables, with and without indexes.)
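
As a standalone mock of what I mean (illustrative names only, not the patch's code), the temporary-table check would come first, before looking at the indexes:

#include <stdbool.h>
#include <stdio.h>

/* Mock: decide the effective worker count for one relation.  A
 * temporary table gets the warning whenever a degree was requested,
 * whether or not the table has indexes. */
static int
choose_nworkers(bool is_temp, bool has_indexes, int requested)
{
    if (is_temp)
    {
        if (requested > 0)
            printf("WARNING:  cannot vacuum temporary tables in parallel\n");
        return -1;              /* force non-parallel vacuum */
    }
    if (requested >= 0 && has_indexes)
        return requested;       /* may still be capped later */
    return -1;
}

int
main(void)
{
    choose_nworkers(true, false, 2);    /* temp, no indexes: still warns */
    return 0;
}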

Thoughts?

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#356Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Mahendra Singh Thalor (#355)

On Tue, 14 Jan 2020 at 16:17, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Tue, 14 Jan 2020 at 10:06, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 14 Jan 2020 at 03:20, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for the update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?

I looked into the code again and thought that if we add a boolean
flag (can_parallel) to the IndexBulkDeleteResult structure to identify
whether an index supports parallel vacuum or not, then it will be easy
to skip those indexes, and we will not call skip_parallel_vacuum_index
multiple times (from vacuum_indexes_leader and parallel_vacuum_index).
We can also keep a linked list of the indexes that don't support
parallel vacuum and pass it directly to vacuum_indexes_leader.

Ex: suppose we have 5 indexes on a table, and before launching
parallel workers we set the boolean flag (can_parallel) in the
IndexBulkDeleteResult structure to record whether each index supports
parallel vacuum or not.
Say indexes 1 and 4 do not support parallel vacuum; then we already
have the info in a linked list (1->4) that they don't support parallel
vacuum, so vacuum_indexes_leader will process those indexes and the
rest will be processed by parallel workers. If a parallel worker finds
that can_parallel is false, it will skip that index.

As per my understanding, if we implement this, we can avoid calling
skip_parallel_vacuum_index multiple times, and if there is no index
that can't perform parallel vacuum, then we will not call
vacuum_indexes_leader at all, as the head of the list will point to
null (we can save an unnecessary call to vacuum_indexes_leader).

Thoughts?

We skip not only the indexes that don't support parallel index vacuum
but also, depending on the vacuum phase, indexes that do support it.
That is, we could skip different indexes at different vacuum phases.
Therefore, with your idea, we would need at least three linked lists,
one for each possible vacuum phase (bulkdelete, conditional cleanup
and cleanup); is that right?

I think we can check whether there are indexes that should be
processed by the leader process before entering the loop in
vacuum_indexes_leader, by comparing the nindexes_parallel_XXX counters
of LVParallelState to the number of indexes, but I'm not sure it's
effective since the number of indexes on a table is usually small.

Hi,

+    /*
+     * Try to initialize the parallel vacuum if requested
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel))
+        {
+            /*
+             * Give warning only if the user explicitly tries to perform a
+             * parallel vacuum on the temporary table.
+             */
+            if (params->nworkers > 0)
+                ereport(WARNING,
+                        (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",

From the v45 patch, we moved the warning for temporary tables inside
the "params->nworkers >= 0 && vacrelstats->useindex" check, so if the
table doesn't have any indexes, we don't give any warning. I think we
should give the warning for all temporary tables if a parallel degree
is given. (Up to the v44 patch, we were giving the warning for all
temporary tables, with and without indexes.)

Thoughts?

Hi,
I did some more review. Below is the 1 review comment for v46-0002.

+    /*
+     * Initialize the state for parallel vacuum
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel)

In the above check, we should also add a "nindexes > 1" check so that if
there is only 1 index, we will not call begin_parallel_vacuum.

"Initialize the state for parallel vacuum",we can improve this comment by
mentioning that what are doing here. (If table has more than index and
parallel vacuum is requested, then try to start parallel vacuum).
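
A tiny standalone sketch of the guard I have in mind (illustrative, not the patch's code):

#include <assert.h>
#include <stdbool.h>

/* Mock of the suggested gating condition: set up a parallel vacuum
 * only when it was not disabled and there are at least two indexes. */
static bool
try_begin_parallel_vacuum(int nworkers, bool useindex, int nindexes)
{
    return nworkers >= 0 && useindex && nindexes > 1;
}

int
main(void)
{
    assert(!try_begin_parallel_vacuum(2, true, 1)); /* single index: skip */
    assert(try_begin_parallel_vacuum(2, true, 2));
    return 0;
}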

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#357Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#353)
2 attachment(s)

On Tue, Jan 14, 2020 at 10:04 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 13 Jan 2020 at 12:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Okay, would it be better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
continue;

Makes sense to me.

I have changed the comment and the condition to make it a positive test
so that it is clearer.

...

Agreed. But with the updated patch the PARALLEL option without a
parallel degree doesn't display the warning because params->nworkers =
0 in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

I had also thought along those lines, but I was not entirely sure
about this resetting of workers. Today, thinking about it again, the
idea Mahendra is suggesting, namely giving an error if the parallel
degree is not specified, seems reasonable to me. This means
Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
error "parallel degree must be specified". This idea has merit: now
that we support parallel vacuum by default, a 'parallel' option
without a parallel degree doesn't have any meaning. If we do that,
then we don't need to do anything additional about the handling of
temp tables (other than what the patch is already doing) either. What
do you think?

Good point! Agreed.

Thanks, changed accordingly.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v47-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchapplication/octet-stream; name=v47-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patchDownload
From 3d573db875714025939802b4e8deb3e95a621821 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 14:36:35 +0530
Subject: [PATCH 1/2] Introduce IndexAM fields for parallel vacuum.

Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem field
tells whether a particular IndexAM uses maintenance_work_mem or not.  This
will help in controlling the memory used by individual workers as,
otherwise, each worker can consume memory equal to maintenance_work_mem.
The amparallelvacuumoptions field tells whether a particular IndexAM
participates in a parallel vacuum and, if so, in which phase (bulkdelete,
vacuumcleanup) of vacuum.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
---
 contrib/bloom/blutils.c                          |  4 +++
 doc/src/sgml/indexam.sgml                        |  4 +++
 src/backend/access/brin/brin.c                   |  4 +++
 src/backend/access/gin/ginutil.c                 |  4 +++
 src/backend/access/gist/gist.c                   |  4 +++
 src/backend/access/hash/hash.c                   |  3 ++
 src/backend/access/nbtree/nbtree.c               |  3 ++
 src/backend/access/spgist/spgutils.c             |  4 +++
 src/include/access/amapi.h                       |  4 +++
 src/include/commands/vacuum.h                    | 38 ++++++++++++++++++++++++
 src/test/modules/dummy_index_am/dummy_index_am.c |  3 ++
 11 files changed, 75 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 23d959b9f0..0104d02f67 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
@@ -121,6 +122,9 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = blbuild;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index dd54c68802..37f8d8760a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -122,6 +122,10 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* does AM support columns included with clause INCLUDE? */
     bool        amcaninclude;
+    /* does AM use maintenance_work_mem? */
+    bool        amusemaintenanceworkmem;
+    /* OR of parallel vacuum flags */
+    uint8       amparallelvacuumoptions;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d89af7844d..2e8f67ef10 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -27,6 +27,7 @@
 #include "access/xloginsert.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -101,6 +102,9 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 910f0bcb91..a7e55caf28 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -20,6 +20,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -53,6 +54,9 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5c9ad341b3..aefc302ed2 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "catalog/pg_collation.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "nodes/execnodes.h"
 #include "storage/lmgr.h"
@@ -74,6 +75,9 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 4bb6efc98f..4871b7ff4d 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -72,6 +72,9 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
 
 	amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8376a5e6b7..5254bc7ef5 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -121,6 +121,9 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = true;
 	amroutine->amcaninclude = true;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = btbuild;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index d715908764..4924ae1c59 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -22,6 +22,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "catalog/pg_amop.h"
+#include "commands/vacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
@@ -56,6 +57,9 @@ spghandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions =
+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = spgbuild;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index d2a49e8d3e..3b3e22f73d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -197,6 +197,10 @@ typedef struct IndexAmRoutine
 	bool		amcanparallel;
 	/* does AM support columns included with clause INCLUDE? */
 	bool		amcaninclude;
+	/* does AM use maintenance_work_mem? */
+	bool		amusemaintenanceworkmem;
+	/* OR of parallel vacuum flags.  See vacuum.h for flags. */
+	uint8		amparallelvacuumoptions;
 	/* type of data stored in index, or InvalidOid if variable */
 	Oid			amkeytype;
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5dc41dd0c1..b3351ad406 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -23,6 +23,44 @@
 #include "storage/lock.h"
 #include "utils/relcache.h"
 
+/*
+ * Flags for amparallelvacuumoptions to control the participation of bulkdelete
+ * and vacuumcleanup in parallel vacuum.
+ */
+
+/*
+ * Both bulkdelete and vacuumcleanup are disabled by default.  This will be
+ * used by IndexAM's that don't want to or cannot participate in parallel
+ * vacuum.  For example, if an index AM doesn't have a way to communicate the
+ * index statistics allocated by the first ambulkdelete call to the subsequent
+ * ones until amvacuumcleanup, the index AM cannot participate in parallel
+ * vacuum.
+ */
+#define VACUUM_OPTION_NO_PARALLEL			0
+
+/*
+ * bulkdelete can be performed in parallel.  This option can be used by
+ * IndexAM's that need to scan the index to delete the tuples.
+ */
+#define VACUUM_OPTION_PARALLEL_BULKDEL		(1 << 0)
+
+/*
+ * vacuumcleanup can be performed in parallel if bulkdelete is not performed
+ * yet.  This will be used by IndexAM's that can scan the index if the
+ * bulkdelete is not performed.
+ */
+#define VACUUM_OPTION_PARALLEL_COND_CLEANUP	(1 << 1)
+
+/*
+ * vacuumcleanup can be performed in parallel even if bulkdelete has already
+ * processed the index.  This will be used by IndexAM's that scan the index
+ * during the cleanup phase of the index, irrespective of whether the index
+ * was already scanned during the bulkdelete phase.
+ */
+#define VACUUM_OPTION_PARALLEL_CLEANUP		(1 << 2)
+
+/* value for checking vacuum flags */
+#define VACUUM_OPTION_MAX_VALID_VALUE		((1 << 3) - 1)
 
 /*----------
  * ANALYZE builds one of these structs for each attribute (column) that is
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 898ab06639..f32632089b 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -16,6 +16,7 @@
 #include "access/amapi.h"
 #include "access/reloptions.h"
 #include "catalog/index.h"
+#include "commands/vacuum.h"
 #include "nodes/pathnodes.h"
 #include "utils/guc.h"
 #include "utils/rel.h"
@@ -294,6 +295,8 @@ dihandler(PG_FUNCTION_ARGS)
 	amroutine->ampredlocks = false;
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
+	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = dibuild;
-- 
2.16.2.windows.1

v47-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchapplication/octet-stream; name=v47-0002-Allow-vacuum-command-to-process-indexes-in-parallel.patchDownload
From c0ea47c8f623766dfcbd7917f0d54fbb84f796cc Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH 2/2] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command which is limited by the number of indexes on a
table.  Specifying zero as a number of workers will disable parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  The index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   61 +-
 src/backend/access/heap/vacuumlazy.c  | 1265 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  133 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   12 +
 src/test/regress/expected/vacuum.out  |   28 +
 src/test/regress/sql/vacuum.sql       |   27 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1449 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d1c90282f..beb3d599c9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4895,7 +4895,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..846056a353 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL <replaceable class="parameter">integer</replaceable>
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal> option
+   and specify the number of parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -223,6 +228,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes that support parallel vacuum operation on the relation,
+      which is further limited by <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      The index can participate in a parallel vacuum if and only if the size
+      of the index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+      Please note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum launches before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +269,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,11 +357,19 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps in proportion to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe90485f..d2c895a6fb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel lazy vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table, and since updates are not allowed during
+ * parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel lazy vacuum.  Unlike other parallel execution code,
+ * since we don't need to worry about DSM keys conflicting with plan_node_id
+ * we can use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel lazy vacuum.  If true, we are
+ * in the parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning. In
+	 * parallel index vacuum, since individual vacuum workers can consume
+	 * memory equal to maintenance_work_mem, the new maintenance_work_mem for
+	 * each worker is set such that the parallel operation doesn't consume
+	 * more memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel index vacuuming
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel index vacuuming.  We have a bitmap to
+	 * indicate which index has stats in shared memory.  The set bit in the
+	 * map indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel lazy
+ * vacuum.  This is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes and parallel lazy vacuum is
+ *		requested, we execute both index vacuum and index cleanup with
+ *		parallel workers.  In parallel lazy vacuum, we enter parallel mode and
+ *		then create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup, and they exit once done with all indexes.  At the end
+ *		of this function we exit from parallel mode.  Index bulk-deletion
+ *		results are stored in the DSM segment, and we update index statistics
+ *		for all the indexes after exiting from parallel mode since writes are
+ *		not allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,46 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Initialize the state for parallel vacuum
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give a warning only if the user explicitly tries to perform a
+			 * parallel vacuum on a temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples if parallel vacuum is not active;
+	 * in the parallel case the space has already been allocated in the DSM
+	 * segment.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +967,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +983,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +994,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1189,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1228,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1374,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1444,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1473,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1588,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1622,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1639,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1699,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1717,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming, in parallel when parallel vacuum is active. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1776,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1785,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1833,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1844,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1974,353 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context was initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to consider it.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Set up the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend as its
+			 * remaining balance from the heap scan has already been
+			 * transferred to the shared cost balance.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value back to the heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *shared_indstats;
+
+		shared_indstats = get_indstats(lps->lvshared, i);
+
+		/* Process the indexes skipped by parallel workers */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[i], lps->lvshared))
+			vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+							 shared_indstats, vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or clean up one index, either by the leader process or by one of
+ * the worker processes.  After processing the index, this function copies
+ * the index statistics returned from ambulkdelete and amvacuumcleanup to
+ * the DSM segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from the
+	 * access method, because it is allocated locally and a different vacuum
+	 * process may process this index in the next cycle.  The copying
+	 * normally happens only once per index.  From the second cycle onward,
+	 * we pass the result stored in the DSM segment so that the access method
+	 * updates it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result to different
+	 * slots, we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we no longer need
+		 * the locally allocated result.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active, we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2330,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2369,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2703,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2727,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2782,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2935,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The
+ * size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on the number of
+ * indexes that support parallel index vacuuming.  This function also sets
+ * can_parallel_vacuum to remember the indexes that participate in parallel
+ * index vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow parallel operation in a standalone backend, or when
+	 * parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel index
+	 * vacuuming.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel index vacuuming */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, set NULL bitmap and the
+ * size of stats for each index.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	/* Currently, we don't support parallel vacuum for autovacuum */
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel index vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * Take alignment into account here, since that is how we estimated the
+	 * shared memory size above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and later use them to
+ * update the index statistics.  One might think that we could exit from
+ * parallel mode, update the index statistics, and then destroy the parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Returns true if the given index can't participate in parallel index vacuum
+ * or parallel index cleanup; returns false otherwise.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel lazy vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuuming")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because the leader and workers belong to the same lock
+	 * group, so the lock does not conflict among them.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
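
To make the variable-size LVShared layout above easier to review: a slot in
the shared stats area exists only for indexes whose bit is set in the NULL
bitmap, and get_indstats() locates the Nth slot by counting the preceding set
bits.  The following is a minimal standalone sketch of that addressing scheme
(the Toy* names are simplified stand-ins, not patch code):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyIndStats
{
	int			updated;
	double		num_index_tuples;
} ToyIndStats;

typedef struct ToyShared
{
	uint32_t	offset;			/* byte offset of the stats array */
	uint8_t		bitmap[4];		/* one bit per index: has a slot or not */
} ToyShared;

static int
toy_stats_is_null(ToyShared *s, int i)
{
	return !(s->bitmap[i >> 3] & (1 << (i & 0x07)));
}

/* Equivalent of get_indstats(): skip over the slots of earlier indexes */
static ToyIndStats *
toy_get_indstats(ToyShared *s, int n)
{
	char	   *p;
	int			i;

	if (toy_stats_is_null(s, n))
		return NULL;

	p = (char *) s + s->offset;
	for (i = 0; i < n; i++)
		if (!toy_stats_is_null(s, i))
			p += sizeof(ToyIndStats);
	return (ToyIndStats *) p;
}

int
main(void)
{
	/* three indexes; only #0 and #2 participate, so only they get slots */
	char	   *area = malloc(sizeof(ToyShared) + 2 * sizeof(ToyIndStats));
	ToyShared  *s = (ToyShared *) area;

	memset(area, 0, sizeof(ToyShared) + 2 * sizeof(ToyIndStats));
	s->offset = sizeof(ToyShared);
	s->bitmap[0] = (1 << 0) | (1 << 2);

	printf("index 0 slot: %s\n", toy_get_indstats(s, 0) ? "yes" : "NULL");
	printf("index 1 slot: %s\n", toy_get_indstats(s, 1) ? "yes" : "NULL");
	printf("index 2 slot: %s\n", toy_get_indstats(s, 2) ? "yes" : "NULL");
	free(area);
	return 0;
}
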
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254954..df06e7d174 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
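
For reviewers, the intended leader-side calling pattern for
ReinitializeParallelWorkers() looks roughly like the sketch below (backend
context assumed; the shm_toc setup, error paths, and the nworkers_per_cycle
array are schematic placeholders, not patch code):

#include "access/parallel.h"

static void
leader_flow_sketch(int nworkers_max, int ncycles,
				   const int *nworkers_per_cycle)
{
	ParallelContext *pcxt;
	int			cycle;

	EnterParallelMode();
	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
								 nworkers_max);
	/* shm_toc_estimate_chunk()/shm_toc_estimate_keys() calls go here */
	InitializeParallelDSM(pcxt);
	/* shm_toc_allocate()/shm_toc_insert() of LVShared etc. go here */

	for (cycle = 0; cycle < ncycles; cycle++)
	{
		/* Reuse the same DSM segment for each relaunch */
		if (cycle > 0)
			ReinitializeParallelDSM(pcxt);

		/* The worker count may differ between bulkdelete and cleanup */
		ReinitializeParallelWorkers(pcxt, nworkers_per_cycle[cycle]);
		LaunchParallelWorkers(pcxt);
		/* ... the leader also processes indexes itself here ... */
		WaitForParallelWorkersToFinish(pcxt);
	}

	DestroyParallelContext(pcxt);
	ExitParallelMode();
}
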
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e252e4..1cd77e79d2 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,39 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("parallel option requires a value between 0 and %d",
+								MAX_PARALLEL_WORKER_LIMIT),
+						 parser_errposition(pstate, opt->location)));
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +199,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +435,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumSharedCostBalance = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1994,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2029,66 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow each worker to sleep in proportion to the work done by it.  We
+ * achieve this by allowing all parallel vacuum workers, including the leader
+ * process, to have a shared view of the cost-related parameters (mainly
+ * VacuumCostBalance).  Each worker updates it whenever it has incurred any
+ * cost and then, based on that, decides whether it needs to sleep.  We
+ * compute the time to sleep for a worker based on the cost it has incurred
+ * (VacuumCostBalanceLocal) and then reduce VacuumSharedCostBalance by that
+ * amount.  This avoids putting to sleep workers that have done little or no
+ * I/O compared to other workers, and thereby ensures that workers doing
+ * more I/O are throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance is more than the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least the current process itself is counted */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
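
To sanity-check the rule above with concrete numbers: with vacuum_cost_limit
= 200, vacuum_cost_delay = 2 ms, and two active workers, a worker's sleep
threshold is 0.5 * (200 / 2) = 50.  A standalone sketch of the arithmetic
(illustrative values only, not patch code):

#include <stdio.h>

int
main(void)
{
	const int	cost_limit = 200;	/* vacuum_cost_limit */
	const double cost_delay = 2.0;	/* vacuum_cost_delay, in ms */
	const int	nworkers = 2;		/* active workers, incl. the leader */

	int			shared_balance = 210;	/* sum of all workers' costs */
	int			local_balance = 60;		/* this worker's own cost */

	/* sleep only above 50% of this worker's share: 0.5 * (200 / 2) = 50 */
	if (shared_balance >= cost_limit &&
		local_balance > 0.5 * (cost_limit / nworkers))
	{
		double		msec = cost_delay * local_balance / cost_limit;

		printf("sleep %.2f ms; shared balance drops to %d\n",
			   msec, shared_balance - local_balance);
	}
	else
		printf("no sleep yet\n");
	return 0;
}

This prints "sleep 0.60 ms; shared balance drops to 150" for the values
above: the worker sleeps in proportion to its own balance, not the shared
one, and only its own contribution is subtracted from the shared balance.
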
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..6d1f28c327 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2fd88866c9..99451fd942 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4caef7..479f17c55f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708ba5f..fc6a5603bb 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad406..62dd01f41f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,13 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default which means choose
+	 * based on the number of indexes.  -1 indicates a parallel vacuum is
+	 * disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +238,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..22cca70687 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,34 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..d6859a5bc9 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,33 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86f92..0242e6627d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
 LVRelStats
+LVShared
+LVSharedIndStats
+LVParallelState
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
2.16.2.windows.1
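
As a side note on the cost-based delay logic above: a process consults compute_parallel_delay() from its delay point only when the shared cost balance is active. The following is a simplified sketch of that call site; the exact shape of vacuum_delay_point() here is an assumption for illustration, not the patch's code.

/*
 * Simplified sketch (assumed shape): how a delay point chooses between the
 * shared-balance computation and the classic single-process computation.
 */
static void
vacuum_delay_point_sketch(void)
{
    double      msec = 0;

    if (!VacuumCostActive || InterruptPending)
        return;

    if (VacuumSharedCostBalance != NULL)
        msec = compute_parallel_delay();    /* parallel vacuum path */
    else if (VacuumCostBalance >= VacuumCostLimit)
        msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;

    if (msec > 0)
    {
        /* Cap the sleep so a single nap never gets too long */
        if (msec > VacuumCostDelay * 4)
            msec = VacuumCostDelay * 4;

        pg_usleep((long) (msec * 1000));
        VacuumCostBalance = 0;
    }
}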

#358Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#355)

On Tue, Jan 14, 2020 at 4:17 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

Hi,

+    /*
+     * Try to initialize the parallel vacuum if requested
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel))
+        {
+            /*
+             * Give warning only if the user explicitly tries to perform a
+             * parallel vacuum on the temporary table.
+             */
+            if (params->nworkers > 0)
+                ereport(WARNING,
+                        (errmsg("disabling parallel option of vacuum
on \"%s\" --- cannot vacuum temporary tables in parallel",

From the v45 patch, we moved the temporary-table warning inside the
"params->nworkers >= 0 && vacrelstats->useindex" check, so if the table
doesn't have any index, we don't give any warning. I think we should
give the warning for all temporary tables if a parallel degree is
given. (Till the v44 patch, we were giving the warning for all
temporary tables, with and without indexes.)

I am not sure how useful it is to give a WARNING in this case, as we
are anyway not going to perform a parallel vacuum because the table
doesn't have an index. One could also argue that a WARNING is expected
in any case where we skip a parallel vacuum for some reason (e.g., if
the size of the index is small), but I don't think that would be a
good idea.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#359Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Mahendra Singh Thalor (#356)

On Tue, 14 Jan 2020 at 17:16, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Tue, 14 Jan 2020 at 16:17, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Tue, 14 Jan 2020 at 10:06, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 14 Jan 2020 at 03:20, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+               /* Skip the indexes that can be processed by parallel workers */
+               if (!skip_index)
+                       continue;

Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?

Again I looked into the code and thought that if we add a boolean
flag (can_parallel) to the IndexBulkDeleteResult structure to identify
whether an index supports parallel vacuum, then it will be easy to
skip those indexes, and we will not call skip_parallel_vacuum_index
multiple times (from vacuum_indexes_leader and parallel_vacuum_index).
We can keep a linked list of the indexes that don't support parallel
vacuum and pass it directly to vacuum_indexes_leader.

Ex: suppose we have 5 indexes on a table. Before launching parallel
workers, we can set a boolean flag (can_parallel) in the
IndexBulkDeleteResult structure to identify whether each index
supports parallel vacuum. Say indexes 1 and 4 don't support parallel
vacuum; then the linked list (1->4) already records which indexes
don't support it, so vacuum_indexes_leader will process these indexes
and the rest will be processed by parallel workers. If a parallel
worker finds that can_parallel is false, it will skip that index.

As per my understanding, if we implement this, we can avoid calling
skip_parallel_vacuum_index multiple times, and if there is no index
that can't perform parallel vacuum, we will not call
vacuum_indexes_leader at all, since the head of the list will point to
null. (We can save an unnecessary call of vacuum_indexes_leader.)

Thoughts?
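
To make the idea above concrete, here is a minimal sketch of the proposed bookkeeping; the IndexVacEntry type and collect_leader_only_indexes() are hypothetical names for illustration and are not part of the patch:

/* Hypothetical per-index entry carrying the proposed can_parallel flag */
typedef struct IndexVacEntry
{
    Relation    indrel;         /* the index relation */
    bool        can_parallel;   /* does it support parallel vacuum? */
    struct IndexVacEntry *next; /* chain of leader-only indexes */
} IndexVacEntry;

/* Link together the indexes that the leader must process itself */
static IndexVacEntry *
collect_leader_only_indexes(IndexVacEntry *entries, int nindexes)
{
    IndexVacEntry *head = NULL;
    int         i;

    for (i = nindexes - 1; i >= 0; i--)
    {
        if (!entries[i].can_parallel)
        {
            entries[i].next = head;
            head = &entries[i];
        }
    }

    /* NULL here means vacuum_indexes_leader could be skipped entirely */
    return head;
}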

We skip not only indexes that don't support parallel index vacuum but
also indexes that support it, depending on the vacuum phase. That is,
we could skip different indexes at different vacuum phases. Therefore,
with your idea, we would need at least three linked lists, one for
each possible vacuum phase (bulkdelete, conditional cleanup and
cleanup). Is that right?

I think we can check whether there are indexes that should be
processed by the leader process before entering the loop in
vacuum_indexes_leader, by comparing the nindexes_parallel_XXX counters
of LVParallelState to the number of indexes, but I'm not sure it's
effective since the number of indexes on a table should be small.
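
That pre-check could look roughly like the following sketch, assuming the LVParallelState counters from the patch; the helper name is hypothetical:

/* Hypothetical helper: does the leader have any index to process itself? */
static bool
leader_must_process_indexes(LVParallelState *lps, int nindexes,
                            bool for_cleanup, bool first_time)
{
    int         nparallel;

    if (for_cleanup)
        nparallel = lps->nindexes_parallel_cleanup +
            (first_time ? lps->nindexes_parallel_condcleanup : 0);
    else
        nparallel = lps->nindexes_parallel_bulkdel;

    /* Any index not counted above must be handled by the leader */
    return nparallel < nindexes;
}

vacuum_indexes_leader could return immediately when this is false, though as noted the savings would be small for typical index counts.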

Hi,

+    /*
+     * Try to initialize the parallel vacuum if requested
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel))
+        {
+            /*
+             * Give warning only if the user explicitly tries to perform a
+             * parallel vacuum on the temporary table.
+             */
+            if (params->nworkers > 0)
+                ereport(WARNING,
+                        (errmsg("disabling parallel option of vacuum
on \"%s\" --- cannot vacuum temporary tables in parallel",

From the v45 patch, we moved the temporary-table warning inside the
"params->nworkers >= 0 && vacrelstats->useindex" check, so if the table
doesn't have any index, we don't give any warning. I think we should
give the warning for all temporary tables if a parallel degree is
given. (Till the v44 patch, we were giving the warning for all
temporary tables, with and without indexes.)

Thoughts?

Hi,
I did some more review. Below is one review comment for v46-0002.

+    /*
+     * Initialize the state for parallel vacuum
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel)

In the above check, we should add an "nindexes > 1" check so that if there is only one index, we will not call begin_parallel_vacuum.

I think an "if (params->nworkers >= 0 && nindexes > 1)" check will be
enough here.
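
In other words, the guard would reduce to something like the following sketch (the helper name is hypothetical; the patch inlines the condition):

/* Hypothetical helper: is parallel vacuum requested and usable here? */
static bool
parallel_vacuum_usable(VacuumParams *params, int nindexes)
{
    /*
     * nworkers >= 0 means the PARALLEL option was not disabled, and since
     * each worker processes exactly one index, at least two indexes are
     * needed for parallelism to help.
     */
    return (params->nworkers >= 0 && nindexes > 1);
}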

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#360Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#357)
1 attachment(s)

On Tue, 14 Jan 2020 at 21:43, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 14, 2020 at 10:04 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 13 Jan 2020 at 12:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Okay, would it better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if (!(get_indstats(lps->lvshared, i) == NULL ||
      skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
    continue;

Make sense to me.

I have changed the comment and condition to make it a positive test so
that it is more clear.

...

Agreed. But with the updated patch, the PARALLEL option without a
parallel degree doesn't display a warning, because params->nworkers = 0
in that case. So how about restoring params->nworkers at the end of
vacuum_rel()?

I had also thought along those lines, but I was not entirely sure
about this resetting of workers. Thinking about it again today, the
idea Mahendra is suggesting, namely giving an error if the parallel
degree is not specified, seems reasonable to me. This means Vacuum
(parallel), Vacuum (parallel) <tbl_name>, etc. will give the error
"parallel degree must be specified". This idea has merit because, now
that we support parallel vacuum by default, a 'parallel' option
without a parallel degree doesn't have any meaning. If we do that,
then we also don't need to do anything additional about the handling
of temp tables (other than what the patch is already doing). What do
you think?

Good point! Agreed.

Thanks, changed accordingly.
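
For illustration, the agreed behavior could be implemented in the option loop of ExecVacuum() roughly as below. This is a sketch: the "parallel option requires a value" wording and the use of MAX_PARALLEL_WORKER_LIMIT and defGetInt32() are assumptions, not necessarily the patch's exact code (the out-of-range message matches the regression output quoted earlier).

else if (strcmp(opt->defname, "parallel") == 0)
{
    if (opt->arg == NULL)
    {
        /* PARALLEL given without a degree: reject it outright */
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("parallel option requires a value between 0 and %d",
                        MAX_PARALLEL_WORKER_LIMIT),
                 parser_errposition(pstate, opt->location)));
    }
    else
    {
        int         nworkers = defGetInt32(opt);

        if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
            ereport(ERROR,
                    (errcode(ERRCODE_SYNTAX_ERROR),
                     errmsg("parallel vacuum degree must be between 0 and %d",
                            MAX_PARALLEL_WORKER_LIMIT),
                     parser_errposition(pstate, opt->location)));

        params.nworkers = nworkers;
    }
}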

Thank you for updating the patch! I have a few small comments. The
rest looks good to me.

1.
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The
+ * relation size of the table don't affect the parallel degree for now.

s/don't/doesn't/

2.
@@ -383,6 +435,7 @@ vacuum(List *relations, VacuumParams *params,
VacuumPageHit = 0;
VacuumPageMiss = 0;
VacuumPageDirty = 0;
+ VacuumSharedCostBalance = NULL;

I think we can initialize VacuumCostBalanceLocal and
VacuumActiveNWorkers here. We use these parameters during parallel
index vacuum and reset them at the end, but we might want to
initialize them here for safety.

3.
+   /* Set cost-based vacuum delay */
+   VacuumCostActive = (VacuumCostDelay > 0);
+   VacuumCostBalance = 0;
+   VacuumPageHit = 0;
+   VacuumPageMiss = 0;
+   VacuumPageDirty = 0;
+   VacuumSharedCostBalance = &(lvshared->cost_balance);
+   VacuumActiveNWorkers = &(lvshared->active_nworkers);

VacuumCostBalanceLocal also needs to be initialized.

4.
The regression tests don't have a test case for PARALLEL 0.

Since I guess you have already modified the code locally, I've attached
a diff containing the above review comments.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

review_v47_masahiko.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2c895a6fb..bf5873edb1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2939,7 +2939,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 /*
  * Compute the number of parallel worker processes to request.  Both index
  * vacuum and index cleanup can be executed with parallel workers.  The
- * relation size of the table don't affect the parallel degree for now.
+ * relation size of the table doesn't affect the parallel degree for now.
  *
  * nrequested is the number of parallel workers that user requested.  If
  * nrequested is 0, we compute the parallel degree based on nindexes, that is
@@ -3365,6 +3365,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
+	VacuumCostBalanceLocal = 0;
 	VacuumSharedCostBalance = &(lvshared->cost_balance);
 	VacuumActiveNWorkers = &(lvshared->active_nworkers);
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1cd77e79d2..c645d4463f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -435,7 +435,9 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumCostBalanceLocal = 0;
 		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
 
 		/*
 		 * Loop to process each selected relation.
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 22cca70687..774ff2fc7c 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -107,6 +107,8 @@ VACUUM (PARALLEL 2) pvactst;
 -- VACUUM invokes parallel bulk-deletion
 UPDATE pvactst SET i = i WHERE i < 1000;
 VACUUM (PARALLEL 2) pvactst;
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
 VACUUM (PARALLEL -1) pvactst; -- error
 ERROR:  parallel vacuum degree must be between 0 and 1024
 LINE 1: VACUUM (PARALLEL -1) pvactst;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index d6859a5bc9..fbd364d7d0 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -93,6 +93,9 @@ VACUUM (PARALLEL 2) pvactst;
 UPDATE pvactst SET i = i WHERE i < 1000;
 VACUUM (PARALLEL 2) pvactst;
 
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+
 VACUUM (PARALLEL -1) pvactst; -- error
 VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
 VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
#361Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Mahendra Singh Thalor (#359)

On Wed, 15 Jan 2020 at 12:34, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

Hi,
I did some more review. Below is one review comment for v46-0002.

+    /*
+     * Initialize the state for parallel vacuum
+     */
+    if (params->nworkers >= 0 && vacrelstats->useindex)
+    {
+        /*
+         * Since parallel workers cannot access data in temporary tables, we
+         * can't perform parallel vacuum on them.
+         */
+        if (RelationUsesLocalBuffers(onerel)

In the above check, we should add an "nindexes > 1" check so that if there is only one index, we will not call begin_parallel_vacuum.

I think an "if (params->nworkers >= 0 && nindexes > 1)" check will be
enough here.

Hmm, I think if we removed vacrelstats->useindex from that condition,
we would call begin_parallel_vacuum even when index cleanup is
disabled.
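
So the guard ends up needing all three conditions, roughly as the v48 patch below does (sketch, omitting the temporary-table check that sits in between):

if (params->nworkers >= 0 && vacrelstats->useindex && nindexes > 1)
    lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
                                vacrelstats, nblocks, nindexes,
                                params->nworkers);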

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#362Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#360)
1 attachment(s)

On Wed, Jan 15, 2020 at 10:05 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch! I have a few small comments.

I have adapted all your changes and fixed the comment by Mahendra
related to initializing the parallel state only when there are at
least two indexes. Additionally, I have changed a few comments (to
make the references to parallel vacuum consistent; in some places we
were referring to it as 'parallel lazy vacuum' and in others as
'parallel index vacuum').

The
rest looks good to me.

Okay, I think the patch is in good shape. I am planning to read it a
few more times (at least twice) and will probably commit it early next
week (Monday or Tuesday) unless there are any major comments. I have
already committed the API patch (4d8a8d0c73).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v48-0001-Allow-vacuum-command-to-process-indexes-in-parallel.patch (application/octet-stream)
From 305e26a67f1a4740fe80192eb208d9359d5c57af Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command which is limited by the number of indexes on a
table.  Specifying zero as a number of workers will disable parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  The index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   61 +-
 src/backend/access/heap/vacuumlazy.c  | 1267 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  135 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   12 +
 src/test/regress/expected/vacuum.out  |   30 +
 src/test/regress/sql/vacuum.sql       |   30 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1458 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d45b6f7cb..3ccacd528b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4915,7 +4915,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8794..846056a353 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL <replaceable class="parameter">integer</replaceable>
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal> option and
+   specify parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -223,6 +228,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the number
+      of indexes that support parallel vacuum operation on the relation which
+      is further limited by <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      The index can participate in a parallel vacuum if and only if the size
+      of the index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+      Please note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
@@ -237,6 +269,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
@@ -316,11 +357,19 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     more than a plain <command>VACUUM</command> would.
    </para>
 
+   <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option> option,
+     it does not affect <option>ANALYZE</option>.
+   </para>
+
    <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps proportional to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe90485f..8d3aad04ed 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table, and since updates are not allowed during
+ * parallel mode, we update the index statistics after exiting from the
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel vacuum.  Unlike other parallel execution code, since
+ * we don't need to worry about DSM keys conflicting with plan_node_id we can
+ * use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning.  In
+	 * parallel vacuum, since individual vacuum workers can consume memory
+	 * equal to maintenance_work_mem, the new maintenance_work_mem for each
+	 * worker is set such that the parallel operation doesn't consume more
+	 * memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel vacuum,
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel vacuum.  We have a bitmap to indicate
+	 * which index has stats in shared memory.  The set bit in the map
+	 * indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel vacuum.  This
+ * is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes, we execute both index vacuum
+ *		and index cleanup with parallel workers unless the parallel vacuum is
+ *		disabled.  In a parallel vacuum, we enter parallel mode and then
+ *		create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes.  At the end of
+ *		this function we exit from parallel mode.  Index bulk-deletion results
+ *		are stored in the DSM segment and we update index statistics for all
+ *		the indexes after exiting from parallel mode since writes are not
+ *		allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,48 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Initialize the state for a parallel vacuum.  As of now, only one worker
+	 * can be used for an index, so we invoke parallelism only if there are at
+	 * least two indexes on a table.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex && nindexes > 1)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give warning only if the user explicitly tries to perform a
+			 * parallel vacuum on the temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +969,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +985,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +996,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1191,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1230,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1376,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1446,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1475,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1590,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1624,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1641,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1701,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1719,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1778,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1787,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1835,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1846,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1976,353 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context is initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to take that into account.  See
+	 * compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between the bulkdelete and cleanup
+		 * phases.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for the leader backend, as we have
+			 * already accumulated the remaining balance from the heap scan.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation.
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation in this phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *shared_indstats;
+
+		shared_indstats = get_indstats(lps->lvshared, i);
+
+		/* Process the indexes skipped by parallel workers */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[i], lps->lvshared))
+			vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+							 shared_indstats, vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index, this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from them,
+	 * because they allocate it locally and it's possible that an index will
+	 * be vacuumed by a different vacuum process the next time.  Copying the
+	 * result normally happens only the first time an index is vacuumed.
+	 * From the second time onwards, we pass the result on the DSM segment so
+	 * that the index AMs update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * We process the indexes serially unless we are doing parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2332,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2371,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2705,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2729,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2784,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2937,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The size
+ * of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that is,
+ * the number of indexes that support parallel vacuum.  This function also
+ * sets can_parallel_vacuum to remember indexes that participate in parallel
+ * vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel vacuum.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* No index supports parallel vacuum */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, set NULL bitmap and the
+ * size of stats for each index.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	/* Currently, we don't support parallel vacuum for autovacuum */
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch at least one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
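+
+	/*
+	 * Compute the per-worker maintenance_work_mem: if at least one index AM
+	 * uses maintenance_work_mem (amusemaintenanceworkmem, e.g. GIN), split
+	 * the budget among up to that many workers; otherwise each process can
+	 * use the full maintenance_work_mem.
+	 */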
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to take alignment into account here because we estimated the
+	 * shared memory size using MAXALIGN above.
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the
+ * updated index statistics from the DSM into local memory and later use
+ * them to update the index statistics.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
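+
+	/*
+	 * Stats slots are allocated only for indexes that participate in
+	 * parallel vacuum, so step over the slots of the participating indexes
+	 * preceding slot n while skipping the bitmap-NULL entries.
+	 */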
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Returns true if the given index can't participate in parallel index
+ * vacuum or parallel index cleanup; false otherwise.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "vacuum")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as that of the leader
+	 * process.  It's okay because the lock mode does not conflict among the
+	 * parallel workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels is sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumCostBalanceLocal = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254954..df06e7d174 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -486,6 +491,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	}
 }
 
+/*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to be launched must not exceed the number of
+	 * workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
 /*
  * Launch parallel workers.
  */
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e252e4..c645d4463f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,39 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("parallel option requires a value between 0 and %d",
+								MAX_PARALLEL_WORKER_LIMIT),
+						 parser_errposition(pstate, opt->location)));
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +199,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +435,9 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumCostBalanceLocal = 0;
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1996,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1966,6 +2031,66 @@ vacuum_delay_point(void)
 	}
 }
 
+/*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow each worker to sleep proportional to the work done by it.  We
+ * achieve this by allowing all parallel vacuum workers including the leader
+ * process to have a shared view of cost related parameters (mainly
+ * VacuumCostBalance).  We allow each worker to update it as and when it has
+ * incurred any cost and then based on that decide whether it needs to sleep.
+ * We compute the time to sleep for a worker based on the cost it has incurred
+ * (VacuumCostBalanceLocal) and then reduce the VacuumSharedCostBalance by
+ * that amount.  This avoids putting to sleep those workers that have done
+ * less or no I/O as compared to other workers, and ensures that workers
+ * doing more I/O get throttled more.
+ *
+ * We allow any worker to sleep only if it has performed the I/O above a
+ * certain threshold, which is calculated based on the number of active
+ * workers (VacuumActiveNWorkers), and the overall cost balance is more than
+ * VacuumCostLimit set by the system.  The testing reveals that we achieve
+ * the required throttling if we allow a worker that has done more than 50%
+ * of its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
 /*
  * A wrapper function of defGetBoolean().
  *
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..6d1f28c327 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b52396c17a..052d98b5c0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4caef7..479f17c55f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -24,6 +24,8 @@
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708ba5f..fc6a5603bb 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad406..62dd01f41f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,13 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default which means choose
+	 * based on the number of indexes.  -1 indicates a parallel vacuum is
+	 * disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +238,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum  */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d882d1..774ff2fc7c 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,36 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f75e9..fbd364d7d0 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,36 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86f92..0242e6627d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
 LVRelStats
+LVShared
+LVSharedIndStats
+LVParallelState
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
2.16.2.windows.1

#363Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#362)

On Wed, 15 Jan 2020 at 17:27, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 15, 2020 at 10:05 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch! I have a few small comments.

I have adapted all your changes, fixed the comment by Mahendra related
to initializing parallel state only when there are at least two
indexes. Additionally, I have changed a few comments (make the
reference to parallel vacuum consistent, at some places we were
referring it as 'parallel lazy vacuum' and at other places it was
'parallel index vacuum').

The
rest looks good to me.

Okay, I think the patch is in good shape. I am planning to read it a
few more times (at least 2 times) and then probably will commit it
early next week (Monday or Tuesday) unless there are any major
comments. I have already committed the API patch (4d8a8d0c73).

Hi,
Thanks, Amit, for fixing the review comments.

I reviewed the v48 patch, and below are some comments.

1.
+ * based on the number of indexes. -1 indicates a parallel vacuum is

I think the above should be "-1 indicates that parallel vacuum is".

2.
+/* Variables for cost-based parallel vacuum */

At the end of the comment there are 2 spaces. I think it should be only 1
space.

3.
I think we should add a test case for the parallel option (when the degree
is not specified).
Ex:
postgres=# VACUUM (PARALLEL) tmp;
ERROR: parallel option requires a value between 0 and 1024
LINE 1: VACUUM (PARALLEL) tmp;
^
postgres=#

Because the above error is added by this parallel patch, we should have a
test case for it to increase code coverage.
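
For instance, a minimal addition to src/test/regress/sql/vacuum.sql could
look like the following sketch (pvactst is the table the patch's existing
tests use; the expected output would have to be regenerated):

VACUUM (PARALLEL) pvactst; -- error, parallel degree not specified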

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#364Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Mahendra Singh Thalor (#363)

Hi,
Below are some more review comments for the v48 patch.

1.
#include "storage/bufpage.h"
#include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
+#include "storage/dsm.h"

Here, the order of the header files is not alphabetical (storage/dsm.h
should come before storage/lockdefs.h).

2.
+    /* No index supports parallel vacuum */
+    if (nindexes_parallel == 0)
+        return 0;
+
+    /* The leader process takes one index */
+    nindexes_parallel--;

Above code can be rearranged as:

+    /* The leader process takes one index */
+    nindexes_parallel--;
+
+    /* No index supports parallel vacuum */
+    if (nindexes_parallel <= 0)
+        return 0;

If we do it like this, then in the case where at most one index supports
parallel vacuum, we can return early and skip some of the parallel-worker
calculations.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#365Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Mahendra Singh Thalor (#364)

Hi,
I checked the code coverage and the time taken by the vacuum.sql test with
and without the v48 patch. Below are some findings (I ran "make check-world
-i" to get coverage).

1.
With the v45 patch, compute_parallel_delay is never called, so the function
hit count is zero. I think we can add some delay options to the vacuum.sql
test to hit the function (see the sketch after this list).

2.
I checked the time taken by the vacuum.sql test. Execution time is almost
the same with and without the v45 patch.

Without v45 patch:
Run1) vacuum ... ok 701 ms
Run2) vacuum ... ok 549 ms
Run3) vacuum ... ok 559 ms
Run4) vacuum ... ok 480 ms

With v45 patch:
Run1) vacuum ... ok 842 ms
Run2) vacuum ... ok 808 ms
Run3) vacuum ... ok 774 ms
Run4) vacuum ... ok 792 ms

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#366Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#365)

On Thu, Jan 16, 2020 at 1:02 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Wed, 15 Jan 2020 at 19:31, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 15 Jan 2020 at 19:04, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

I reviewed the v48 patch and below are some comments.

1.
+ * based on the number of indexes. -1 indicates a parallel vacuum is

I think the above should be "-1 indicates that parallel vacuum is"

I am not an expert in this matter, but I am not sure your suggestion
is correct. I thought an article was required here, but I could be
wrong. Can you please clarify?

2.
+/* Variables for cost-based parallel vacuum */

At the end of the comment, there are 2 spaces. I think it should be only 1 space.

3.
I think we should add a test case for the parallel option (when the degree is not specified).
Ex:
postgres=# VACUUM (PARALLEL) tmp;
ERROR: parallel option requires a value between 0 and 1024
LINE 1: VACUUM (PARALLEL) tmp;
^
postgres=#

Because the above error is added by this parallel patch, we should have a test case for it to increase code coverage.

I thought about it but was not sure whether to add a test for it. We
might not want to add a test for each and every case, as that will
increase the number and runtime of tests without a significant
advantage. Now that you have pointed this out, I can add a test for it
unless someone else thinks otherwise.
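
For reference, the error in question comes from the VACUUM
option-parsing path; a rough sketch of that check (simplified from the
patch, assuming MAX_PARALLEL_WORKER_LIMIT is the 1024 limit shown in
the message, and names like params.nworkers and pstate follow the
patch but are assumptions here):

    else if (strcmp(opt->defname, "parallel") == 0)
    {
        if (opt->arg == NULL)
        {
            /* PARALLEL given without a degree */
            ereport(ERROR,
                    (errcode(ERRCODE_SYNTAX_ERROR),
                     errmsg("parallel option requires a value between 0 and %d",
                            MAX_PARALLEL_WORKER_LIMIT),
                     parser_errposition(pstate, opt->location)));
        }
        else
        {
            int         nworkers = defGetInt32(opt);

            if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
                ereport(ERROR,
                        (errcode(ERRCODE_SYNTAX_ERROR),
                         errmsg("parallel vacuum degree must be between 0 and %d",
                                MAX_PARALLEL_WORKER_LIMIT),
                         parser_errposition(pstate, opt->location)));
            params.nworkers = nworkers;
        }
    }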

1.
With the v45 patch, compute_parallel_delay is never called, so the
function hit count is zero. I think we can add some delay options to
the vacuum.sql test to hit that function.

But how can we meaningfully test the functionality of the delay? It
would be tricky to come up with a portable test that can always
produce consistent results.
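
For context, compute_parallel_delay() makes each worker sleep in
proportion to its own share of the accumulated cost; roughly like this
sketch (simplified from the patch series, exact details may differ):

    static double
    compute_parallel_delay(void)
    {
        double      msec = 0;
        uint32      shared_balance;
        int         nworkers;

        /* Parallel vacuum must be active */
        Assert(VacuumSharedCostBalance);

        nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);

        /* At least count itself */
        Assert(nworkers >= 1);

        /* Add our local balance into the shared value atomically */
        shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
                                                 VacuumCostBalance);

        /* Accumulate the total local balance for this worker */
        VacuumCostBalanceLocal += VacuumCostBalance;

        if ((shared_balance >= VacuumCostLimit) &&
            (VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers)))
        {
            /* Sleep time is based on this worker's local balance */
            msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
            pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
                                    VacuumCostBalanceLocal);
            VacuumCostBalanceLocal = 0;
        }

        /* Reset the local balance; it is accumulated in the shared value */
        VacuumCostBalance = 0;

        return msec;
    }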

2.
I checked the time taken by the vacuum.sql test. Execution time is
almost the same with and without the v45 patch.

Without v45 patch:
Run1) vacuum ... ok 701 ms
Run2) vacuum ... ok 549 ms
Run3) vacuum ... ok 559 ms
Run4) vacuum ... ok 480 ms

With v45 patch:
Run1) vacuum ... ok 842 ms
Run2) vacuum ... ok 808 ms
Run3) vacuum ... ok 774 ms
Run4) vacuum ... ok 792 ms

I see some variance in the results; have you run with autovacuum set
to off? I was expecting that this might speed up some cases where
parallel vacuum is used by default.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#367Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#366)

On Thu, 16 Jan 2020 at 08:22, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 16, 2020 at 1:02 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Wed, 15 Jan 2020 at 19:31, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 15 Jan 2020 at 19:04, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

I reviewed the v48 patch and below are some comments.

1.
+ * based on the number of indexes. -1 indicates a parallel vacuum is

I think the above should be "-1 indicates that parallel vacuum is"

I am not an expert in this matter, but I am not sure your suggestion
is correct. I thought an article was required here, but I could be
wrong. Can you please clarify?

2.
+/* Variables for cost-based parallel vacuum */

At the end of the comment, there are 2 spaces. I think it should be only 1 space.

3.
I think we should add a test case for the parallel option (when the degree is not specified).
Ex:
postgres=# VACUUM (PARALLEL) tmp;
ERROR: parallel option requires a value between 0 and 1024
LINE 1: VACUUM (PARALLEL) tmp;
^
postgres=#

Because the above error is added by this parallel patch, we should have a test case for it to increase code coverage.

I thought about it but was not sure whether to add a test for it. We
might not want to add a test for each and every case, as that will
increase the number and runtime of tests without a significant
advantage. Now that you have pointed this out, I can add a test for it
unless someone else thinks otherwise.

1.
With the v45 patch, compute_parallel_delay is never called, so the
function hit count is zero. I think we can add some delay options to
the vacuum.sql test to hit that function.

But how can we meaningfully test the functionality of the delay? It
would be tricky to come up with a portable test that can always
produce consistent results.

2.
I checked the time taken by the vacuum.sql test. Execution time is
almost the same with and without the v45 patch.

Without v45 patch:
Run1) vacuum ... ok 701 ms
Run2) vacuum ... ok 549 ms
Run3) vacuum ... ok 559 ms
Run4) vacuum ... ok 480 ms

With v45 patch:
Run1) vacuum ... ok 842 ms
Run2) vacuum ... ok 808 ms
Run3) vacuum ... ok 774 ms
Run4) vacuum ... ok 792 ms

I see some variance in the results; have you run with autovacuum set
to off? I was expecting that this might speed up some cases where
parallel vacuum is used by default.

I think this is an expected difference in timing because we are adding
some vacuum-related tests. I am not starting the server manually (that
is, I am starting the server with only the default settings).

If we start the server with default settings, then the vacuum-related
test cases will not exercise the parallel path, because the index
relations are very small, so we will not do a parallel vacuum.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#368Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#367)

On Thu, Jan 16, 2020 at 10:11 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 16 Jan 2020 at 08:22, Amit Kapila <amit.kapila16@gmail.com> wrote:

2.
I checked the time taken by the vacuum.sql test. Execution time is
almost the same with and without the v45 patch.

Without v45 patch:
Run1) vacuum ... ok 701 ms
Run2) vacuum ... ok 549 ms
Run3) vacuum ... ok 559 ms
Run4) vacuum ... ok 480 ms

With v45 patch:
Run1) vacuum ... ok 842 ms
Run2) vacuum ... ok 808 ms
Run3) vacuum ... ok 774 ms
Run4) vacuum ... ok 792 ms

I see some variance in the results; have you run with autovacuum set
to off? I was expecting that this might speed up some cases where
parallel vacuum is used by default.

I think this is an expected difference in timing because we are adding
some vacuum-related tests. I am not starting the server manually (that
is, I am starting the server with only the default settings).

Can you test once with autovacuum = off? Autovacuum leads to
variability in test timings.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#369Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#368)

On Thu, 16 Jan 2020 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 16, 2020 at 10:11 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 16 Jan 2020 at 08:22, Amit Kapila <amit.kapila16@gmail.com> wrote:

2.
I checked the time taken by the vacuum.sql test. Execution time is
almost the same with and without the v45 patch.

Without v45 patch:
Run1) vacuum ... ok 701 ms
Run2) vacuum ... ok 549 ms
Run3) vacuum ... ok 559 ms
Run4) vacuum ... ok 480 ms

With v45 patch:
Run1) vacuum ... ok 842 ms
Run2) vacuum ... ok 808 ms
Run3) vacuum ... ok 774 ms
Run4) vacuum ... ok 792 ms

I see some variance in the results; have you run with autovacuum set
to off? I was expecting that this might speed up some cases where
parallel vacuum is used by default.

I think this is an expected difference in timing because we are adding
some vacuum-related tests. I am not starting the server manually (that
is, I am starting the server with only the default settings).

Can you test once with autovacuum = off? Autovacuum leads to
variability in test timings.

I've also run the regression tests with and without the patch:

* w/o patch and autovacuum = on: 255 ms
* w/o patch and autovacuum = off: 258 ms
* w/ patch and autovacuum = on: 370 ms
* w/ patch and autovacuum = off: 375 ms

If we start the server with default settings, then the vacuum-related
test cases will not exercise the parallel path, because the index
relations are very small, so we will not do a parallel vacuum.

Right. Most indexes (all?) of the tables used in the regression tests
are smaller than min_parallel_index_scan_size. We do set
min_parallel_index_scan_size to 0 in vacuum.sql, but VACUUM would not
be sped up much because of the relation sizes. Since we instead
populate a new table for parallel vacuum testing, the regression test
for vacuum takes longer.
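
Concretely, the per-index eligibility check in
compute_parallel_vacuum_workers() looks roughly like this (a
simplified sketch, not the patch's exact code; the VACUUM_OPTION_*
flags come from the already-committed API patch):

    for (i = 0; i < nindexes; i++)
    {
        uint8       vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

        /* Skip indexes unsuitable or too small for parallel processing */
        if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
            RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
            continue;

        can_parallel_vacuum[i] = true;

        if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
            nindexes_parallel_bulkdel++;
        if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0 ||
            (vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
            nindexes_parallel_cleanup++;
    }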

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#370Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#369)
1 attachment(s)

On Thu, Jan 16, 2020 at 4:46 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Right. Most indexes (all?) of the tables used in the regression tests
are smaller than min_parallel_index_scan_size. We do set
min_parallel_index_scan_size to 0 in vacuum.sql, but VACUUM would not
be sped up much because of the relation sizes. Since we instead
populate a new table for parallel vacuum testing, the regression test
for vacuum takes longer.

Fair enough, and I think it is good in a way that it won't change the
coverage of the existing vacuum code. I have fixed all the issues
reported by Mahendra and a few other cosmetic things in the attached
patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v49-0001-Allow-vacuum-command-to-process-indexes-in-parallel.patchapplication/octet-stream; name=v49-0001-Allow-vacuum-command-to-process-indexes-in-parallel.patchDownload
From 79241c58c3f925a670b64fee26d6acda456ebeda Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to the VACUUM
command where the user can specify the number of workers that can be used
to perform the command, which is limited by the number of indexes on a
table.  Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   61 +-
 src/backend/access/heap/vacuumlazy.c  | 1266 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  135 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   12 +
 src/test/regress/expected/vacuum.out  |   34 +
 src/test/regress/sql/vacuum.sql       |   31 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1462 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d45b6f..3ccacd5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without the <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4915,7 +4915,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..846056a 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL <replaceable class="parameter">integer</replaceable>
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal> option
+   and specify the number of parallel workers as zero.  <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -224,6 +229,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform vacuum index and cleanup index phases of <command>VACUUM</command>
+      in parallel using <replaceable class="parameter">integer</replaceable>
+      background workers (for the detail of each vacuum phases, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation that support parallel vacuum, which
+      is further limited by <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      An index can participate in a parallel vacuum if and only if the size
+      of the index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+      Please note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum are launched before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +270,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,10 +358,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps in proportion to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe904..fe6a44a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table, and since updates are not allowed during
+ * parallel mode, we update the index statistics after exiting from
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel vacuum.  Unlike other parallel execution code, since
+ * we don't need to worry about DSM keys conflicting with plan_node_id we can
+ * use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  So this is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion has not been performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning.  In
+	 * parallel vacuum, since individual vacuum workers can consume memory
+	 * equal to maintenance_work_mem, the new maintenance_work_mem for each
+	 * worker is set such that the parallel operation doesn't consume more
+	 * memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel vacuum,
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel vacuum.  We have a bitmap to indicate
+	 * which index has stats in shared memory.  The set bit in the map
+	 * indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel vacuum.  This
+ * is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes, we execute both index vacuum
+ *		and index cleanup with parallel workers unless the parallel vacuum is
+ *		disabled.  In a parallel vacuum, we enter parallel mode and then
+ *		create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes.  At the end of
+ *		this function we exit from parallel mode.  Index bulk-deletion results
+ *		are stored in the DSM segment and we update index statistics for all
+ *		the indexes after exiting from parallel mode since writes are not
+ *		allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,48 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Initialize the state for a parallel vacuum.  As of now, only one worker
+	 * can be used for an index, so we invoke parallelism only if there are at
+	 * least two indexes on a table.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex && nindexes > 1)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give warning only if the user explicitly tries to perform a
+			 * parallel vacuum on the temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +969,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +985,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +996,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1191,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1230,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1376,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1446,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1475,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1590,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1624,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1641,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1701,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1719,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1778,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1787,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1835,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1846,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1976,352 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context is initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the
+	 * current phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/* Enable shared cost balance */
+		VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+		VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.
+		 */
+		pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+		pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+		}
+		else
+		{
+			/*
+			 * Disable shared cost balance if we are not able to launch
+			 * workers.
+			 */
+			VacuumSharedCostBalance = NULL;
+			VacuumActiveNWorkers = NULL;
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/* Carry the shared balance value to heap scan */
+	if (VacuumSharedCostBalance)
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+	if (nworkers > 0)
+	{
+		/* Disable shared cost balance */
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in the parallel
+		 * operation
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed only by the leader process
+ * because these indexes don't support parallel operation at that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *shared_indstats;
+
+		shared_indstats = get_indstats(lps->lvshared, i);
+
+		/* Process the indexes skipped by parallel workers */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[i], lps->lvshared))
+			vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+							 shared_indstats, vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment the first time we get it from
+	 * them, because they allocate it locally and it's possible that an
+	 * index will be vacuumed by a different vacuum process next time.  The
+	 * copying of the result normally happens only after the first index
+	 * vacuuming.  From the second time onward, we pass the result on the
+	 * DSM segment so that they update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2331,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2370,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2704,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2728,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2783,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2936,449 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The
+ * relation size of the table doesn't affect the parallel degree for now.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that
+ * is, the number of indexes that support parallel vacuum.  This function
+ * also sets can_parallel_vacuum to remember the indexes that participate
+ * in parallel vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel vacuum.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* No index supports parallel vacuum */
+	if (nindexes_parallel == 0)
+		return 0;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics: set the NULL bitmap and
+ * the size of the stats for each index.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	/* Currently, we don't support parallel vacuum for autovacuum */
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns the parallel vacuum state if we can
+ * launch even one worker.  It is responsible for creating a parallel
+ * context, entering parallel mode, and initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation.
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	/*
+	 * We need to care about alignment here because the shared memory size
+	 * was estimated that way (with MAXALIGN).
+	 */
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed while in parallel mode, we copy the updated
+ * index statistics from the DSM into local memory and then later use them
+ * to update the index statistics.  One might think that we could exit
+ * parallel mode, update the index statistics, and then destroy the parallel
+ * context, but that wouldn't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Returns true if the given index can't participate in parallel index
+ * vacuum or parallel index cleanup; false otherwise.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report the progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "bulk delete")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open the table.  The lock mode is the same as the leader process's.
+	 * It's okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted by OID, which should match the
+	 * leader's ordering.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumCostBalanceLocal = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254..df06e7d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context so that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must be less than or equal to the
+	 * number of workers with which the parallel context was initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e25..c645d44 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,39 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("parallel option requires a value between 0 and %d",
+								MAX_PARALLEL_WORKER_LIMIT),
+						 parser_errposition(pstate, opt->location)));
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +199,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +435,9 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumCostBalanceLocal = 0;
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1996,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2032,66 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel index vacuuming
+ * is to allow each worker to sleep in proportion to the work done by it.
+ * We achieve this by allowing all parallel vacuum workers, including the
+ * leader process, to have a shared view of cost-related parameters (mainly
+ * VacuumCostBalance).  Each worker updates it as and when it has incurred
+ * any cost and then, based on that, decides whether it needs to sleep.  We
+ * compute the time to sleep for a worker based on the cost it has incurred
+ * (VacuumCostBalanceLocal) and then reduce VacuumSharedCostBalance by that
+ * amount.  This avoids putting to sleep workers that have done little or
+ * no I/O compared to other workers, and thereby ensures that workers doing
+ * more I/O get throttled more.
+ *
+ * We allow a worker to sleep only if it has performed I/O above a certain
+ * threshold, which is calculated based on the number of active workers
+ * (VacuumActiveNWorkers), and the overall cost balance is more than the
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve the
+ * required throttling if we allow a worker that has done more than 50% of
+ * its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e3..6d1f28c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b52396c..052d98b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4ca..00a17f5 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -23,7 +23,9 @@
 #include "nodes/lockoptions.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
+#include "storage/dsm.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708b..fc6a560 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad..c27d255 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,13 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means the
+	 * parallel degree is chosen based on the number of indexes.  -1
+	 * indicates that parallel vacuum is disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +238,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..f4250a4 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,40 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+VACUUM (PARALLEL) pvactst; -- error, cannot use PARALLEL option without parallel degree
+ERROR:  parallel option requires a value between 0 and 1024
+LINE 1: VACUUM (PARALLEL) pvactst;
+                ^
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..cf741f7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,37 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+VACUUM (PARALLEL) pvactst; -- error, cannot use PARALLEL option without parallel degree
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86..0242e66 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
+LVParallelState
 LVRelStats
+LVShared
+LVSharedIndStats
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
1.8.3.1

#371Prabhat Sahu
prabhat.sahu@enterprisedb.com
In reply to: Amit Kapila (#370)

Hi all,

I would like to share my observations on this PG feature, "Block-level
parallel vacuum".
I have tested the earlier patch (i.e., v48) with the below high-level test
scenarios, and those are working as expected.

- I have played around with these GUC parameters while testing:

max_worker_processes
autovacuum = off
shared_buffers
max_parallel_workers
max_parallel_maintenance_workers
min_parallel_index_scan_size
vacuum_cost_limit
vacuum_cost_delay

- Tested parallel vacuum with plain tables and partitioned tables covering
the possible datatypes, with columns having various indexes (btree, gist,
etc.) on part of or the full table.
- Tested pgbench tables with multiple manually created indexes, and ran a
script (vacuum_test.sql) with DMLs and VACUUM for multiple clients, jobs,
and durations as below.

./pgbench -c 8 -j 16 -T 900 postgres -f vacuum_test.sql

We observed the usage of parallel workers during VACUUM.

- Ran a few isolation schedule test cases (in regression) with huge data
and indexes, performed DMLs -> VACUUM
- Tested with PARTITION TABLEs -> global/local indexes -> DMLs -> VACUUM
- Tested with a PARTITION TABLE having different TABLESPACEs in different
locations -> global/local indexes -> DMLs -> VACUUM
- Changed STORAGE options for columns (PLAIN / EXTERNAL / EXTENDED) ->
DMLs -> VACUUM
- Created indexes with the CONCURRENTLY option / changed storage parameters
for indexes as below -> DMLs -> VACUUM

with(buffering=auto) / with(buffering=on) / with(buffering=off) /
with(fillfactor=30);

- Tested with creating simple and partitioned tables -> DMLs ->
pg_dump/pg_restore/pg_upgrade -> VACUUM

Verified the data after restore / upgrade / VACUUM.

- Indexes on UUID-OSSP data -> DMLs -> pg_upgrade -> VACUUM
- Verified various test scenarios showing better performance of
parallel VACUUM compared to non-parallel VACUUM.

Time taken by VACUUM on PG HEAD+PATCH (with PARALLEL) < Time taken by
VACUUM on PG HEAD (without PARALLEL): the median run time dropped from
~37.6 s to ~33.6 s, roughly a 10% improvement on this data set.

Machine configuration: (16 VCPUs / RAM: 16GB / Disk size: 640GB)

*PG HEAD:*
VACUUM tab1;
Time: 38915.384 ms (00:38.915)
Time: 48389.006 ms (00:48.389)
Time: 41324.223 ms (00:41.324)
*Time: 37640.874 ms (00:37.641) --median*
Time: 36897.325 ms (00:36.897)
Time: 36351.022 ms (00:36.351)
Time: 36198.890 ms (00:36.199)

*PG HEAD + v48 Patch:*
VACUUM tab1;
Time: 37051.589 ms (00:37.052)
*Time: 33647.459 ms (00:33.647) --median*
Time: 31580.894 ms (00:31.581)
Time: 34442.046 ms (00:34.442)
Time: 31335.960 ms (00:31.336)
Time: 34441.245 ms (00:34.441)
Time: 31159.639 ms (00:31.160)

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

#372Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#370)

On Thu, Jan 16, 2020 at 5:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 16, 2020 at 4:46 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Right. Most indexes (all?) of tables that are used in the regression
tests are smaller than min_parallel_index_scan_size. And we set
min_parallel_index_scan_size to 0 in vacuum.sql but VACUUM would not
be speeded-up much because of the relation size. Since we instead
populate new table for parallel vacuum testing the regression test for
vacuum would take a longer time.

Fair enough and I think it is good in a way that it won't change the
coverage of existing vacuum code. I have fixed all the issues
reported by Mahendra and have fixed a few other cosmetic things in the
attached patch.

I have a few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

Why are we checking parallel_workers <= 0? Function
compute_parallel_vacuum_workers only returns 0 or greater than 0,
so isn't it better to just check if (parallel_workers == 0)?

2.
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)

The (LVParallelState *) (lps) typecast is not required; just (lps)
!= NULL should be enough.
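That is, a plain null test should suffice (sketch only):

#define ParallelVacuumIsActive(lps) ((lps) != NULL)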

3.

+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+ prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+ pg_atomic_init_u32(&(shared->idx), 0);
+ pg_atomic_init_u32(&(shared->cost_balance), 0);
+ pg_atomic_init_u32(&(shared->active_nworkers), 0);

I think it will look cleaner if we initialize them in the order they
are declared in the structure.

4.
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+ /*
+ * The number of workers can vary between bulkdelete and cleanup
+ * phase.
+ */
+ ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+ LaunchParallelWorkers(lps->pcxt);
+
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /*
+ * Reset the local cost values for leader backend as we have
+ * already accumulated the remaining balance of heap.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+ }
+ else
+ {
+ /*
+ * Disable shared cost balance if we are not able to launch
+ * workers.
+ */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }
+

I don't like the idea of first initializing
VacuumSharedCostBalance with lps->lvshared->cost_balance and then
uninitializing it if nworkers_launched is 0.
I am not sure why we need to initialize VacuumSharedCostBalance
here; is it just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance)?
I think we can initialize it only if nworkers_launched > 0; then we can
get rid of the else branch completely.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#373Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#372)

On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Jan 16, 2020 at 5:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jan 16, 2020 at 4:46 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Right. Most indexes (all?) of tables that are used in the regression
tests are smaller than min_parallel_index_scan_size. And we set
min_parallel_index_scan_size to 0 in vacuum.sql but VACUUM would not
be speeded-up much because of the relation size. Since we instead
populate new table for parallel vacuum testing the regression test for
vacuum would take a longer time.

Fair enough and I think it is good in a way that it won't change the
coverage of existing vacuum code. I have fixed all the issues
reported by Mahendra and have fixed a few other cosmetic things in the
attached patch.

I have few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

why are we checking parallel_workers <= 0, Function
compute_parallel_vacuum_workers only returns 0 or greater than 0
so isn't it better to just check if (parallel_workers == 0) ?

2.
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)

(LVParallelState *) (lps) -> this typecast is not required, just (lps)
!= NULL should be enough.

3.

+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+ prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+ pg_atomic_init_u32(&(shared->idx), 0);
+ pg_atomic_init_u32(&(shared->cost_balance), 0);
+ pg_atomic_init_u32(&(shared->active_nworkers), 0);

I think it will look cleaner if we can initialize in the order they
are declared in structure.

4.
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+ /*
+ * The number of workers can vary between bulkdelete and cleanup
+ * phase.
+ */
+ ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+ LaunchParallelWorkers(lps->pcxt);
+
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /*
+ * Reset the local cost values for leader backend as we have
+ * already accumulated the remaining balance of heap.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+ }
+ else
+ {
+ /*
+ * Disable shared cost balance if we are not able to launch
+ * workers.
+ */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }
+

I don't like the idea of first initializing the
VacuumSharedCostBalance with lps->lvshared->cost_balance and then
uninitialize if nworkers_launched is 0.
I am not sure why do we need to initialize VacuumSharedCostBalance
here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance);?
I think we can initialize it only if nworkers_launched > 0 then we can
get rid of the else branch completely.

I missed one of my comments:

+ /* Carry the shared balance value to heap scan */
+ if (VacuumSharedCostBalance)
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+ if (nworkers > 0)
+ {
+ /* Disable shared cost balance */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }

It doesn't make sense to keep them as two conditions; we can combine them as below

/* If shared costing is enabled, carry the shared balance value to heap
scan and disable the shared costing */
if (VacuumSharedCostBalance)
{
VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
VacuumSharedCostBalance = NULL;
VacuumActiveNWorkers = NULL;
}

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#374Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#372)

On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

why are we checking parallel_workers <= 0, Function
compute_parallel_vacuum_workers only returns 0 or greater than 0
so isn't it better to just check if (parallel_workers == 0) ?

Why have such an assumption about
compute_parallel_vacuum_workers()? The function
compute_parallel_vacuum_workers() returns int, so such a check
(<= 0) seems reasonable to me.

2.
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)

(LVParallelState *) (lps) -> this typecast is not required, just (lps)
!= NULL should be enough.

I think the better idea would be to just replace it with PointerIsValid,
like below. I see similar usage in other places.
#define ParallelVacuumIsActive(lps) PointerIsValid(lps)

3.

+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+ prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+ pg_atomic_init_u32(&(shared->idx), 0);
+ pg_atomic_init_u32(&(shared->cost_balance), 0);
+ pg_atomic_init_u32(&(shared->active_nworkers), 0);

I think it will look cleaner if we can initialize in the order they
are declared in structure.

Okay.

4.
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+ /*
+ * The number of workers can vary between bulkdelete and cleanup
+ * phase.
+ */
+ ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+ LaunchParallelWorkers(lps->pcxt);
+
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /*
+ * Reset the local cost values for leader backend as we have
+ * already accumulated the remaining balance of heap.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+ }
+ else
+ {
+ /*
+ * Disable shared cost balance if we are not able to launch
+ * workers.
+ */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }
+

I don't like the idea of first initializing the
VacuumSharedCostBalance with lps->lvshared->cost_balance and then
uninitialize if nworkers_launched is 0.
I am not sure why do we need to initialize VacuumSharedCostBalance
here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance);?
I think we can initialize it only if nworkers_launched > 0 then we can
get rid of the else branch completely.

No, we can't initialize it after nworkers_launched > 0 because by that
time some workers would have already tried to access the shared cost
balance. So, it needs to be done before launching the workers, as is
done in the code. We can probably add a comment.
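To make the ordering concrete (a rough sketch using the patch's names, with
the surrounding setup elided):

/* Prime the shared values before any worker can read them */
pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
pg_atomic_write_u32(VacuumActiveNWorkers, 0);

LaunchParallelWorkers(lps->pcxt);	/* workers may read the shared balance immediately */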

+ /* Carry the shared balance value to heap scan */
+ if (VacuumSharedCostBalance)
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+ if (nworkers > 0)
+ {
+ /* Disable shared cost balance */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }

Doesn't make sense to keep them as two conditions, we can combine them as below

/* If shared costing is enable, carry the shared balance value to heap
scan and disable the shared costing */
if (VacuumSharedCostBalance)
{
VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
VacuumSharedCostBalance = NULL;
VacuumActiveNWorkers = NULL;
}

makes sense to me, will change.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#375Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#374)

On Fri, Jan 17, 2020 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

why are we checking parallel_workers <= 0, Function
compute_parallel_vacuum_workers only returns 0 or greater than 0
so isn't it better to just check if (parallel_workers == 0) ?

Why to have such an assumption about
compute_parallel_vacuum_workers()? The function
compute_parallel_vacuum_workers() returns int, so such a check
(<= 0) seems reasonable to me.

Okay, so I should probably change my statement to: why is
compute_parallel_vacuum_workers returning "int" instead of uint? I
mean, when this function is designed to return 0 or more workers, why
make it return int and then handle extra values in the caller? Am I
missing something? Can it really return a negative value in some cases?

I find the below code in "compute_parallel_vacuum_workers" a bit confusing

+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+ bool *can_parallel_vacuum)
+{
......
+ /* The leader process takes one index */
+ nindexes_parallel--;        --> nindexes_parallel can become -1
+
+ /* No index supports parallel vacuum */
+ if (nindexes_parallel == 0)   -> Now if it is 0 then we return 0, but if
it's -1 then we continue.  Seems strange, no?  I think right here we can
handle (nindexes_parallel <= 0); that will make the code cleaner.
+ return 0;
+
+ /* Compute the parallel degree */
+ parallel_workers = (nrequested > 0) ?
+ Min(nrequested, nindexes_parallel) : nindexes_parallel;
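Concretely, something like this is what I have in mind (a rough sketch
against the v48 code, with the rest of the function elided):

	/* The leader process takes one index */
	nindexes_parallel--;

	/* No index left that supports parallel vacuum */
	if (nindexes_parallel <= 0)
		return 0;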
2.
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)

(LVParallelState *) (lps) -> this typecast is not required, just (lps)
!= NULL should be enough.

I think the better idea would be to just replace it PointerIsValid
like below. I see similar usage in other places.
#define ParallelVacuumIsActive(lps) PointerIsValid(lps)

Makes sense to me.

3.

+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+ prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+ pg_atomic_init_u32(&(shared->idx), 0);
+ pg_atomic_init_u32(&(shared->cost_balance), 0);
+ pg_atomic_init_u32(&(shared->active_nworkers), 0);

I think it will look cleaner if we can initialize in the order they
are declared in structure.

Okay.

4.
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+ /*
+ * The number of workers can vary between bulkdelete and cleanup
+ * phase.
+ */
+ ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+ LaunchParallelWorkers(lps->pcxt);
+
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /*
+ * Reset the local cost values for leader backend as we have
+ * already accumulated the remaining balance of heap.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+ }
+ else
+ {
+ /*
+ * Disable shared cost balance if we are not able to launch
+ * workers.
+ */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }
+

I don't like the idea of first initializing the
VacuumSharedCostBalance with lps->lvshared->cost_balance and then
uninitialize if nworkers_launched is 0.
I am not sure why do we need to initialize VacuumSharedCostBalance
here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance);?
I think we can initialize it only if nworkers_launched > 0 then we can
get rid of the else branch completely.

No, we can't initialize after nworkers_launched > 0 because by that
time some workers would have already tried to access the shared cost
balance. So, it needs to be done before launching the workers as is
done in code. We can probably add a comment.

I don't think so. VacuumSharedCostBalance is a process-local pointer that
just points to the shared memory variable, right?

Each process has to point it to the shared memory, and we are already
doing that in parallel_vacuum_main. So we can initialize it after the
workers are launched.
Basically, the code will look like below:

pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
pg_atomic_write_u32(&(lps->lvshared->active_nworkers), 0);
..
ReinitializeParallelWorkers(lps->pcxt, nworkers);

LaunchParallelWorkers(lps->pcxt);

if (lps->pcxt->nworkers_launched > 0)
{
..
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
}
-- remove the else part completely..

+ /* Carry the shared balance value to heap scan */
+ if (VacuumSharedCostBalance)
+ VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+
+ if (nworkers > 0)
+ {
+ /* Disable shared cost balance */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }

Doesn't make sense to keep them as two conditions, we can combine them as below

/* If shared costing is enable, carry the shared balance value to heap
scan and disable the shared costing */
if (VacuumSharedCostBalance)
{
VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
VacuumSharedCostBalance = NULL;
VacuumActiveNWorkers = NULL;
}

makes sense to me, will change.

ok

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#376Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#375)

On Fri, Jan 17, 2020 at 11:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Jan 17, 2020 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

why are we checking parallel_workers <= 0, Function
compute_parallel_vacuum_workers only returns 0 or greater than 0
so isn't it better to just check if (parallel_workers == 0) ?

Why to have such an assumption about
compute_parallel_vacuum_workers()? The function
compute_parallel_vacuum_workers() returns int, so such a check
(<= 0) seems reasonable to me.

Okay so I should probably change my statement to why
compute_parallel_vacuum_workers is returning "int" instead of uint?

Hmm, I think the number of workers is int in most places, so it is
better to return int here, which keeps it consistent with what we do
elsewhere. See the similar usage in compute_parallel_worker.

I mean when this function is designed to return 0 or more worker why to
make it return int and then handle extra values on caller. Am I
missing something, can it really return negative in some cases?

I find the below code in "compute_parallel_vacuum_workers" a bit confusing

+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+ bool *can_parallel_vacuum)
+{
......
+ /* The leader process takes one index */
+ nindexes_parallel--;        --> nindexes_parallel can become -1
+
+ /* No index supports parallel vacuum */
+ if (nindexes_parallel == 0)   -> Now if it is 0 we return 0, but if
it is -1 we continue; that seems strange, no?  I think we can handle
if (nindexes_parallel <= 0) right here, which will make the code cleaner.
+ return 0;
+

I think this was recently introduced by one of my changes based on the
comment by Mahendra; we can adjust this check.
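
Something like the below minimal adjustment should do (just a sketch of
the suggested change, not necessarily the final code):

+ /* The leader process takes one index */
+ nindexes_parallel--;
+
+ /* Return if no index is left for the parallel workers */
+ if (nindexes_parallel <= 0)
+     return 0;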

I don't like the idea of first initializing
VacuumSharedCostBalance to lps->lvshared->cost_balance and then
uninitializing it if nworkers_launched is 0.
I am not sure why we need to initialize VacuumSharedCostBalance
here; is it just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance)?
I think if we initialize it only when nworkers_launched > 0, we can
get rid of the else branch completely.

No, we can't initialize it only after nworkers_launched > 0 because by
that time some workers would have already tried to access the shared
cost balance. So, it needs to be done before launching the workers, as
is done in the code. We can probably add a comment.

I don't think so; VacuumSharedCostBalance is process-local and is
just pointing to the shared memory variable, right?

And each process has to point it at the shared memory, which we are
already doing in parallel_vacuum_main. So we can initialize it after
the workers are launched.
Basically, the code will look like below:

pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
pg_atomic_write_u32(&(lps->lvshared->active_nworkers), 0);

Oh, I thought you were suggesting initializing the shared memory itself
after launching the workers. However, you are asking to change the
usage of the local variable; I think we can do that.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#377Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#376)

On Fri, Jan 17, 2020 at 11:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 11:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Jan 17, 2020 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have few small comments.

1.
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

Why are we checking parallel_workers <= 0? The function
compute_parallel_vacuum_workers only returns 0 or a value greater
than 0, so isn't it better to just check if (parallel_workers == 0)?

Why have such an assumption about
compute_parallel_vacuum_workers()? The function
compute_parallel_vacuum_workers() returns int, so such a check
(<= 0) seems reasonable to me.

Okay, so I should probably rephrase my question: why does
compute_parallel_vacuum_workers return "int" instead of uint?

Hmm, I think the number of workers is an int in most places, so it is
better to return int here, which keeps it consistent with what we do
elsewhere. See the similar usage in compute_parallel_worker.

Okay, I see.

I mean, when this function is designed to return 0 or more workers,
why make it return int and then handle extra values in the caller? Am
I missing something? Can it really return negative in some cases?

I find the below code in "compute_parallel_vacuum_workers" a bit confusing:

+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+ bool *can_parallel_vacuum)
+{
......
+ /* The leader process takes one index */
+ nindexes_parallel--;        --> nindexes_parallel can become -1
+
+ /* No index supports parallel vacuum */
+ if (nindexes_parallel == 0)   -> Now if it is 0 we return 0, but if
it is -1 we continue; that seems strange, no?  I think we can handle
if (nindexes_parallel <= 0) right here, which will make the code cleaner.
+ return 0;
+

I think this was recently introduced by one of my changes based on the
comment by Mahendra; we can adjust this check.

Ok

I don't like the idea of first initializing
VacuumSharedCostBalance to lps->lvshared->cost_balance and then
uninitializing it if nworkers_launched is 0.
I am not sure why we need to initialize VacuumSharedCostBalance
here; is it just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance)?
I think if we initialize it only when nworkers_launched > 0, we can
get rid of the else branch completely.

No, we can't initialize it only after nworkers_launched > 0 because by
that time some workers would have already tried to access the shared
cost balance. So, it needs to be done before launching the workers, as
is done in the code. We can probably add a comment.

I don't think so; VacuumSharedCostBalance is process-local and is
just pointing to the shared memory variable, right?

And each process has to point it at the shared memory, which we are
already doing in parallel_vacuum_main. So we can initialize it after
the workers are launched.
Basically, the code will look like below:

pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
pg_atomic_write_u32(&(lps->lvshared->active_nworkers), 0);

Oh, I thought you were suggesting initializing the shared memory itself
after launching the workers. However, you are asking to change the
usage of the local variable; I think we can do that.

Okay.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#378Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#377)

On Fri, Jan 17, 2020 at 11:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have performed cost-delay testing on the latest patch (I have used
the same scripts as attached in [1] and [2]).
vacuum_cost_delay = 10
vacuum_cost_limit = 2000

Observation: as we concluded earlier, each worker's delay time is in
sync with the I/O performed by that worker, and the total delay (heap +
index) is almost the same as for the non-parallel operation.

test1: [1]

Vacuum non-parallel

WARNING: VacuumCostTotalDelay=11332.320000

Vacuum 2 workers
WARNING: worker 0 delay=171.085000 total io=34288 hit=22208 miss=0 dirty=604
WARNING: worker 1 delay=87.790000 total io=17910 hit=17890 miss=0 dirty=1
WARNING: worker 2 delay=88.620000 total io=17910 hit=17890 miss=0 dirty=1

WARNING: VacuumCostTotalDelay=11505.650000

Vacuum 4 workers
WARNING: worker 0 delay=87.750000 total io=17910 hit=17890 miss=0 dirty=1
WARNING: worker 1 delay=89.155000 total io=17910 hit=17890 miss=0 dirty=1
WARNING: worker 2 delay=87.080000 total io=17910 hit=17890 miss=0 dirty=1
WARNING: worker 3 delay=78.745000 total io=16378 hit=4318 miss=0 dirty=603

WARNING: VacuumCostTotalDelay=11590.680000

test2: [2]

Vacuum non-parallel
WARNING: VacuumCostTotalDelay=22835.970000

Vacuum 2 workers
WARNING: worker 0 delay=345.550000 total io=69338 hit=45338 miss=0 dirty=1200
WARNING: worker 1 delay=177.150000 total io=35807 hit=35787 miss=0 dirty=1
WARNING: worker 2 delay=178.105000 total io=35807 hit=35787 miss=0 dirty=1
WARNING: VacuumCostTotalDelay=23191.405000

Vacuum 4 workers
WARNING: worker 0 delay=177.265000 total io=35807 hit=35787 miss=0 dirty=1
WARNING: worker 1 delay=177.175000 total io=35807 hit=35787 miss=0 dirty=1
WARNING: worker 2 delay=177.385000 total io=35807 hit=35787 miss=0 dirty=1
WARNING: worker 3 delay=166.515000 total io=33531 hit=9551 miss=0 dirty=1199
WARNING: VacuumCostTotalDelay=23357.115000
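
As a quick sanity check on that proportionality: in the 2-worker run of
test2, worker 0 did 69338 units of I/O and slept 345.550 ms, while
workers 1 and 2 each did 35807 units and slept ~177-178 ms; 69338/35807
= 1.94 and 345.550/177.150 = 1.95, so each worker's sleep time closely
tracks the I/O it performed.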

[1]: /messages/by-id/CAFiTN-tFLN=vdu5Ra-23E9_7Z1JXkk5MkRY3Bkj2zAoWK7fULA@mail.gmail.com
[2]: /messages/by-id/CAFiTN-tC=NcvcEd+5J62fR8-D8x7EHuVi2xhS-0DMf1bnJs4hw@mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#379Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#378)
1 attachment(s)

On Fri, Jan 17, 2020 at 12:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Jan 17, 2020 at 11:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have performed cost-delay testing on the latest patch (I have used
the same scripts as attached in [1] and [2]).
vacuum_cost_delay = 10
vacuum_cost_limit = 2000

Observation: as we concluded earlier, each worker's delay time is in
sync with the I/O performed by that worker, and the total delay (heap +
index) is almost the same as for the non-parallel operation.

Thanks for doing this test again. In the attached patch, I have
addressed all the review comments and modified a few code comments.
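
For anyone picking up the patch for testing, the new option added here
is exercised as, for example (the table name is just a placeholder):

VACUUM (PARALLEL 2, VERBOSE) test_tbl;

PARALLEL takes a non-negative integer, can't be combined with FULL, and
specifying 0 disables parallelism.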

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v50-0001-Allow-vacuum-command-to-process-indexes-in-parallel.patch (application/octet-stream)
From b67f2c648c510748f3ba0604440dd4dd210a84bd Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 9 Jan 2020 15:49:46 +0530
Subject: [PATCH] Allow vacuum command to process indexes in parallel.

This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command, which is limited by the number of indexes on a
table.  Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  The index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
---
 doc/src/sgml/config.sgml              |   18 +-
 doc/src/sgml/ref/vacuum.sgml          |   61 +-
 src/backend/access/heap/vacuumlazy.c  | 1257 ++++++++++++++++++++++++++++++---
 src/backend/access/transam/parallel.c |   26 +-
 src/backend/commands/vacuum.c         |  135 +++-
 src/backend/postmaster/autovacuum.c   |    2 +
 src/bin/psql/tab-complete.c           |    2 +-
 src/include/access/heapam.h           |    3 +
 src/include/access/parallel.h         |    4 +-
 src/include/commands/vacuum.h         |   12 +
 src/test/regress/expected/vacuum.out  |   34 +
 src/test/regress/sql/vacuum.sql       |   31 +
 src/tools/pgindent/typedefs.list      |    4 +
 13 files changed, 1453 insertions(+), 136 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d45b6f..3ccacd5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2308,13 +2308,13 @@ include_dir 'conf.d'
        <listitem>
         <para>
          Sets the maximum number of parallel workers that can be
-         started by a single utility command.  Currently, the only
-         parallel utility command that supports the use of parallel
-         workers is <command>CREATE INDEX</command>, and only when
-         building a B-tree index.  Parallel workers are taken from the
-         pool of processes established by <xref
-         linkend="guc-max-worker-processes"/>, limited by <xref
-         linkend="guc-max-parallel-workers"/>.  Note that the requested
+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> only when building a B-tree index,
+         and <command>VACUUM</command> without <literal>FULL</literal>
+         option.  Parallel workers are taken from the pool of processes
+         established by <xref linkend="guc-max-worker-processes"/>, limited
+         by <xref linkend="guc-max-parallel-workers"/>.  Note that the requested
          number of workers may not actually be available at run time.
          If this occurs, the utility operation will run with fewer
          workers than expected.  The default value is 2.  Setting this
@@ -4915,7 +4915,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
         for a parallel scan to be considered.  Note that a parallel index scan
         typically won't touch the entire index; it is the number of pages
         which the planner believes will actually be touched by the scan which
-        is relevant.
+        is relevant.  This parameter is also used to decide whether a
+        particular index can participate in a parallel vacuum.  See
+        <xref linkend="sql-vacuum"/>.
         If this value is specified without units, it is taken as blocks,
         that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
         The default is 512 kilobytes (<literal>512kB</literal>).
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index f9b0fb8..846056a 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -34,6 +34,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
     INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
+    PARALLEL <replaceable class="parameter">integer</replaceable>
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -75,10 +76,14 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    with normal reading and writing of the table, as an exclusive lock
    is not obtained.  However, extra space is not returned to the operating
    system (in most cases); it's just kept available for re-use within the
-   same table.  <command>VACUUM FULL</command> rewrites the entire contents
-   of the table into a new disk file with no extra space, allowing unused
-   space to be returned to the operating system.  This form is much slower and
-   requires an exclusive lock on each table while it is being processed.
+   same table.  It also allows us to leverage multiple CPUs in order to process
+   indexes.  This feature is known as <firstterm>parallel vacuum</firstterm>.
+   To disable this feature, one can use the <literal>PARALLEL</literal>
+   option and specify the number of parallel workers as zero.
+   <command>VACUUM FULL</command> rewrites
+   the entire contents of the table into a new disk file with no extra space,
+   allowing unused space to be returned to the operating system.  This form is
+   much slower and requires an exclusive lock on each table while it is being
+   processed.
   </para>
 
   <para>
@@ -224,6 +229,33 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PARALLEL</literal></term>
+    <listitem>
+     <para>
+      Perform the index vacuum and index cleanup phases of
+      <command>VACUUM</command> in parallel using
+      <replaceable class="parameter">integer</replaceable>
+      background workers (for details of each vacuum phase, please
+      refer to <xref linkend="vacuum-phases"/>).  If the
+      <literal>PARALLEL</literal> option is omitted, then
+      <command>VACUUM</command> decides the number of workers based on the
+      number of indexes on the relation that support parallel vacuum, which
+      is further limited by <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      The index can participate in a parallel vacuum if and only if the size
+      of the index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+      Please note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a vacuum to run with fewer
+      workers than specified, or even with no workers at all.  Only one worker
+      can be used per index.  So parallel workers are launched only when there
+      are at least <literal>2</literal> indexes in the table.  Workers for
+      vacuum launches before starting each phase and exit at the end of
+      the phase.  These behaviors might change in a future release.  This
+      option can't be used with the <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -238,6 +270,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="parameter">integer</replaceable></term>
+    <listitem>
+     <para>
+      Specifies a non-negative integer value passed to the selected option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">table_name</replaceable></term>
     <listitem>
      <para>
@@ -317,10 +358,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+     The <option>PARALLEL</option> option is used only for vacuum purposes.
+     Even if this option is specified with the <option>ANALYZE</option>
+     option, it does not affect <option>ANALYZE</option>.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
-    it is sometimes advisable to use the cost-based vacuum delay feature.
-    See <xref linkend="runtime-config-resource-vacuum-cost"/> for details.
+    it is sometimes advisable to use the cost-based vacuum delay feature.  For
+    parallel vacuum, each worker sleeps proportional to the work done by that
+    worker.  See <xref linkend="runtime-config-resource-vacuum-cost"/> for
+    details.
    </para>
 
    <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a5fe904..e5ac323 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -22,6 +22,20 @@
  * of index scans performed.  So we don't use maintenance_work_mem memory for
  * the TID array, just enough to hold as many heap tuples as fit on one page.
  *
+ * Lazy vacuum supports parallel execution with parallel worker processes.  In
+ * a parallel vacuum, we perform both index vacuum and index cleanup with
+ * parallel worker processes.  Individual indexes are processed by one vacuum
+ * process.  At the beginning of a lazy vacuum (at lazy_scan_heap) we prepare
+ * the parallel context and initialize the DSM segment that contains shared
+ * information as well as the memory space for storing dead tuples.  When
+ * starting either index vacuum or index cleanup, we launch parallel worker
+ * processes.  Once all indexes are processed the parallel worker processes
+ * exit.  After that, the leader process re-initializes the parallel context
+ * so that it can use the same DSM for multiple passes of index vacuum and
+ * for performing index cleanup.  For updating the index statistics, we need
+ * to update the system table and since updates are not allowed during
+ * parallel mode we update the index statistics after exiting from the
+ * parallel mode.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -36,25 +50,30 @@
 
 #include <math.h>
 
+#include "access/amapi.h"
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/heapam_xlog.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
+#include "access/parallel.h"
 #include "access/transam.h"
 #include "access/visibilitymap.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
+#include "optimizer/paths.h"
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -110,6 +129,142 @@
  */
 #define PREFETCH_SIZE			((BlockNumber) 32)
 
+/*
+ * DSM keys for parallel vacuum.  Unlike other parallel execution code, since
+ * we don't need to worry about DSM keys conflicting with plan_node_id we can
+ * use small integers.
+ */
+#define PARALLEL_VACUUM_KEY_SHARED			1
+#define PARALLEL_VACUUM_KEY_DEAD_TUPLES		2
+#define PARALLEL_VACUUM_KEY_QUERY_TEXT		3
+
+/*
+ * Macro to check if we are in a parallel vacuum.  If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) PointerIsValid(lps)
+
+/*
+ * LVDeadTuples stores the dead tuple TIDs collected during the heap scan.
+ * This is allocated in the DSM segment in parallel mode and in local memory
+ * in non-parallel mode.
+ */
+typedef struct LVDeadTuples
+{
+	int			max_tuples;		/* # slots allocated in array */
+	int			num_tuples;		/* current # of entries */
+	/* List of TIDs of tuples we intend to delete */
+	/* NB: this list is ordered by TID address */
+	ItemPointerData itemptrs[FLEXIBLE_ARRAY_MEMBER];	/* array of
+														 * ItemPointerData */
+} LVDeadTuples;
+
+#define SizeOfLVDeadTuples(cnt) \
+		add_size((offsetof(LVDeadTuples, itemptrs)), \
+				 mul_size(sizeof(ItemPointerData), cnt))
+
+/*
+ * Shared information among parallel workers.  This is allocated in the DSM
+ * segment.
+ */
+typedef struct LVShared
+{
+	/*
+	 * Target table relid and log level.  These fields are not modified during
+	 * the lazy vacuum.
+	 */
+	Oid			relid;
+	int			elevel;
+
+	/*
+	 * An indication for vacuum workers to perform either index vacuum or
+	 * index cleanup.  first_time is true only if for_cleanup is true and
+	 * bulk-deletion is not performed yet.
+	 */
+	bool		for_cleanup;
+	bool		first_time;
+
+	/*
+	 * Fields for both index vacuum and cleanup.
+	 *
+	 * reltuples is the total number of input heap tuples.  We set either old
+	 * live tuples in the index vacuum case or the new live tuples in the
+	 * index cleanup case.
+	 *
+	 * estimated_count is true if the reltuples is an estimated value.
+	 */
+	double		reltuples;
+	bool		estimated_count;
+
+	/*
+	 * In single process lazy vacuum we could consume more memory during index
+	 * vacuuming or cleanup apart from the memory for heap scanning.  In
+	 * parallel vacuum, since individual vacuum workers can consume memory
+	 * equal to maintenance_work_mem, the new maintenance_work_mem for each
+	 * worker is set such that the parallel operation doesn't consume more
+	 * memory than single process lazy vacuum.
+	 */
+	int			maintenance_work_mem_worker;
+
+	/*
+	 * Shared vacuum cost balance.  During parallel vacuum,
+	 * VacuumSharedCostBalance points to this value and it accumulates the
+	 * balance of each parallel vacuum worker.
+	 */
+	pg_atomic_uint32 cost_balance;
+
+	/*
+	 * Number of active parallel workers.  This is used for computing the
+	 * minimum threshold of the vacuum cost balance for a worker to go for the
+	 * delay.
+	 */
+	pg_atomic_uint32 active_nworkers;
+
+	/*
+	 * Variables to control parallel vacuum.  We have a bitmap to indicate
+	 * which index has stats in shared memory.  The set bit in the map
+	 * indicates that the particular index supports a parallel vacuum.
+	 */
+	pg_atomic_uint32 idx;		/* counter for vacuuming and clean up */
+	uint32		offset;			/* sizeof header incl. bitmap */
+	bits8		bitmap[FLEXIBLE_ARRAY_MEMBER];	/* bit map of NULLs */
+
+	/* Shared index statistics data follows at end of struct */
+} LVShared;
+
+#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
+#define GetSharedIndStats(s) \
+	((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
+#define IndStatsIsNull(s, i) \
+	(!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))
+
+/*
+ * Struct for an index bulk-deletion statistic used for parallel vacuum.  This
+ * is allocated in the DSM segment.
+ */
+typedef struct LVSharedIndStats
+{
+	bool		updated;		/* are the stats updated? */
+	IndexBulkDeleteResult stats;
+} LVSharedIndStats;
+
+/* Struct for maintaining a parallel vacuum state. */
+typedef struct LVParallelState
+{
+	ParallelContext *pcxt;
+
+	/* Shared information among parallel vacuum workers */
+	LVShared   *lvshared;
+
+	/*
+	 * The number of indexes that support parallel index bulk-deletion and
+	 * parallel index cleanup respectively.
+	 */
+	int			nindexes_parallel_bulkdel;
+	int			nindexes_parallel_cleanup;
+	int			nindexes_parallel_condcleanup;
+} LVParallelState;
+
 typedef struct LVRelStats
 {
 	/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,11 +283,7 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
@@ -155,15 +306,15 @@ static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 						   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
-static void lazy_vacuum_index(Relation indrel,
-							  IndexBulkDeleteResult **stats,
-							  LVRelStats *vacrelstats);
-static void lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-									Relation *Irel, int nindexes,
-									IndexBulkDeleteResult **indstats);
+static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+									IndexBulkDeleteResult **stats,
+									LVRelStats *vacrelstats, LVParallelState *lps,
+									int nindexes);
+static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+							  LVDeadTuples *dead_tuples, double reltuples);
 static void lazy_cleanup_index(Relation indrel,
-							   IndexBulkDeleteResult *stats,
-							   LVRelStats *vacrelstats);
+							   IndexBulkDeleteResult **stats,
+							   double reltuples, bool estimated_count);
 static int	lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 							 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
 static bool should_attempt_truncation(VacuumParams *params,
@@ -172,12 +323,41 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 											LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
-static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
+static void lazy_record_dead_tuple(LVDeadTuples *dead_tuples,
 								   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static void lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+										 LVRelStats *vacrelstats, LVParallelState *lps,
+										 int nindexes);
+static void parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVShared *lvshared, LVDeadTuples *dead_tuples,
+								  int nindexes);
+static void vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+								  LVRelStats *vacrelstats, LVParallelState *lps,
+								  int nindexes);
+static void vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+							 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+							 LVDeadTuples *dead_tuples);
+static void lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+									 LVRelStats *vacrelstats, LVParallelState *lps,
+									 int nindexes);
+static long compute_max_dead_tuples(BlockNumber relblocks, bool hasindex);
+static int	compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+											bool *can_parallel_vacuum);
+static void prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+									 int nindexes);
+static void update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+									int nindexes);
+static LVParallelState *begin_parallel_vacuum(Oid relid, Relation *Irel,
+											  LVRelStats *vacrelstats, BlockNumber nblocks,
+											  int nindexes, int nrequested);
+static void end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+								LVParallelState *lps, int nindexes);
+static LVSharedIndStats *get_indstats(LVShared *lvshared, int n);
+static bool skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared);
 
 
 /*
@@ -491,6 +671,18 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		dead-tuple TIDs, invoke vacuuming of indexes and call lazy_vacuum_heap
  *		to reclaim dead line pointers.
  *
+ *		If the table has at least two indexes, we execute both index vacuum
+ *		and index cleanup with parallel workers unless the parallel vacuum is
+ *		disabled.  In a parallel vacuum, we enter parallel mode and then
+ *		create both the parallel context and the DSM segment before starting
+ *		heap scan so that we can record dead tuples to the DSM segment.  All
+ *		parallel workers are launched at the beginning of index vacuuming and
+ *		index cleanup and they exit once done with all indexes.  At the end of
+ *		this function we exit from parallel mode.  Index bulk-deletion results
+ *		are stored in the DSM segment and we update index statistics for all
+ *		the indexes after exiting from parallel mode since writes are not
+ *		allowed during parallel mode.
+ *
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
@@ -499,6 +691,8 @@ static void
 lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
+	LVParallelState *lps = NULL;
+	LVDeadTuples *dead_tuples;
 	BlockNumber nblocks,
 				blkno;
 	HeapTupleData tuple;
@@ -556,13 +750,48 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Initialize the state for a parallel vacuum.  As of now, only one worker
+	 * can be used for an index, so we invoke parallelism only if there are at
+	 * least two indexes on a table.
+	 */
+	if (params->nworkers >= 0 && vacrelstats->useindex && nindexes > 1)
+	{
+		/*
+		 * Since parallel workers cannot access data in temporary tables, we
+		 * can't perform parallel vacuum on them.
+		 */
+		if (RelationUsesLocalBuffers(onerel))
+		{
+			/*
+			 * Give warning only if the user explicitly tries to perform a
+			 * parallel vacuum on the temporary table.
+			 */
+			if (params->nworkers > 0)
+				ereport(WARNING,
+						(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
+								RelationGetRelationName(onerel))));
+		}
+		else
+			lps = begin_parallel_vacuum(RelationGetRelid(onerel), Irel,
+										vacrelstats, nblocks, nindexes,
+										params->nworkers);
+	}
+
+	/*
+	 * Allocate the space for dead tuples in case the parallel vacuum is not
+	 * initialized.
+	 */
+	if (!ParallelVacuumIsActive(lps))
+		lazy_space_alloc(vacrelstats, nblocks);
+
+	dead_tuples = vacrelstats->dead_tuples;
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
-	initprog_val[2] = vacrelstats->max_dead_tuples;
+	initprog_val[2] = dead_tuples->max_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/*
@@ -740,8 +969,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
 		 */
-		if ((vacrelstats->max_dead_tuples - vacrelstats->num_dead_tuples) < MaxHeapTuplesPerPage &&
-			vacrelstats->num_dead_tuples > 0)
+		if ((dead_tuples->max_tuples - dead_tuples->num_tuples) < MaxHeapTuplesPerPage &&
+			dead_tuples->num_tuples > 0)
 		{
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
@@ -756,8 +985,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			}
 
 			/* Work on all the indexes, then the heap */
-			lazy_vacuum_all_indexes(onerel, vacrelstats, Irel,
-									nindexes, indstats);
+			lazy_vacuum_all_indexes(onerel, Irel, indstats,
+									vacrelstats, lps, nindexes);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -767,7 +996,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
@@ -962,7 +1191,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		has_dead_tuples = false;
 		nfrozen = 0;
 		hastup = false;
-		prev_dead_count = vacrelstats->num_dead_tuples;
+		prev_dead_count = dead_tuples->num_tuples;
 		maxoff = PageGetMaxOffsetNumber(page);
 
 		/*
@@ -1001,7 +1230,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1147,7 +1376,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
 				tups_vacuumed += 1;
@@ -1217,7 +1446,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * doing a second scan. Also we don't do that but forget dead tuples
 		 * when index cleanup is disabled.
 		 */
-		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && dead_tuples->num_tuples > 0)
 		{
 			if (nindexes == 0)
 			{
@@ -1246,7 +1475,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 * not to reset latestRemovedXid since we want that value to be
 			 * valid.
 			 */
-			vacrelstats->num_dead_tuples = 0;
+			dead_tuples->num_tuples = 0;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1361,7 +1590,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 * page, so remember its free space as-is.  (This path will always be
 		 * taken if there are no indexes.)
 		 */
-		if (vacrelstats->num_dead_tuples == prev_dead_count)
+		if (dead_tuples->num_tuples == prev_dead_count)
 			RecordPageWithFreeSpace(onerel, blkno, freespace);
 	}
 
@@ -1395,11 +1624,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	/* If any tuples need to be deleted, perform final vacuum cycle */
 	/* XXX put a threshold on min number of tuples here? */
-	if (vacrelstats->num_dead_tuples > 0)
+	if (dead_tuples->num_tuples > 0)
 	{
 		/* Work on all the indexes, and then the heap */
-		lazy_vacuum_all_indexes(onerel, vacrelstats, Irel, nindexes,
-								indstats);
+		lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
+								lps, nindexes);
 
 		/* Remove tuples from heap */
 		lazy_vacuum_heap(onerel, vacrelstats);
@@ -1412,17 +1641,22 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	if (blkno > next_fsm_block_to_vacuum)
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
-	/* report all blocks vacuumed; and that we're cleaning up */
+	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
-	/* Do post-vacuum cleanup and statistics update for each index */
+	/* Do post-vacuum cleanup */
 	if (vacrelstats->useindex)
-	{
-		for (i = 0; i < nindexes; i++)
-			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
-	}
+		lazy_cleanup_all_indexes(Irel, indstats, vacrelstats, lps, nindexes);
+
+	/*
+	 * End parallel mode before updating index statistics as we cannot write
+	 * during parallel mode.
+	 */
+	if (ParallelVacuumIsActive(lps))
+		end_parallel_vacuum(Irel, indstats, lps, nindexes);
+
+	/* Update index statistics */
+	update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1467,15 +1701,16 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 /*
  *	lazy_vacuum_all_indexes() -- vacuum all indexes of relation.
  *
- *		This is a utility wrapper for lazy_vacuum_index(), able to do
- *		progress reporting.
+ * We process the indexes serially unless we are doing parallel vacuum.
  */
 static void
-lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
-						Relation *Irel, int nindexes,
-						IndexBulkDeleteResult **indstats)
+lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
+						IndexBulkDeleteResult **stats,
+						LVRelStats *vacrelstats, LVParallelState *lps,
+						int nindexes)
 {
-	int			i;
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
 
 	/* Log cleanup info before we touch indexes */
 	vacuum_log_cleanup_info(onerel, vacrelstats);
@@ -1484,9 +1719,30 @@ lazy_vacuum_all_indexes(Relation onerel, LVRelStats *vacrelstats,
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
-	/* Remove index entries */
-	for (i = 0; i < nindexes; i++)
-		lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
+	/* Perform index vacuuming with parallel workers for parallel vacuum. */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index vacuuming */
+		lps->lvshared->for_cleanup = false;
+		lps->lvshared->first_time = false;
+
+		/*
+		 * We can only provide an approximate value of num_heap_tuples in
+		 * vacuum cases.
+		 */
+		lps->lvshared->reltuples = vacrelstats->old_live_tuples;
+		lps->lvshared->estimated_count = true;
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		int			idx;
+
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+							  vacrelstats->old_live_tuples);
+	}
 
 	/* Increase and report the number of index scans */
 	vacrelstats->num_index_scans++;
@@ -1522,7 +1778,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 	npages = 0;
 
 	tupindex = 0;
-	while (tupindex < vacrelstats->num_dead_tuples)
+	while (tupindex < vacrelstats->dead_tuples->num_tuples)
 	{
 		BlockNumber tblk;
 		Buffer		buf;
@@ -1531,7 +1787,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 
 		vacuum_delay_point();
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
 								 vac_strategy);
 		if (!ConditionalLockBufferForCleanup(buf))
@@ -1579,6 +1835,7 @@ static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
 {
+	LVDeadTuples *dead_tuples = vacrelstats->dead_tuples;
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxOffsetNumber];
 	int			uncnt = 0;
@@ -1589,16 +1846,16 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	for (; tupindex < dead_tuples->num_tuples; tupindex++)
 	{
 		BlockNumber tblk;
 		OffsetNumber toff;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
+		tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
 		if (tblk != blkno)
 			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
 		itemid = PageGetItemId(page, toff);
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
@@ -1719,19 +1976,345 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup)
 	return false;
 }
 
+/*
+ * Perform index vacuum or index cleanup with parallel workers.  This function
+ * must be used by the parallel vacuum leader process.  The caller must set
+ * lps->lvshared->for_cleanup to indicate whether to perform vacuum or
+ * cleanup.
+ */
+static void
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+							 LVRelStats *vacrelstats, LVParallelState *lps,
+							 int nindexes)
+{
+	int			nworkers;
+
+	Assert(!IsParallelWorker());
+	Assert(ParallelVacuumIsActive(lps));
+	Assert(nindexes > 0);
+
+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;
+
+	/*
+	 * It is possible that the parallel context is initialized with fewer
+	 * workers than the number of indexes that need a separate worker in the current
+	 * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+	 */
+	nworkers = Min(nworkers, lps->pcxt->nworkers);
+
+	/* Setup the shared cost-based vacuum delay and launch workers */
+	if (nworkers > 0)
+	{
+		if (vacrelstats->num_index_scans > 0)
+		{
+			/* Reset the parallel index processing counter */
+			pg_atomic_write_u32(&(lps->lvshared->idx), 0);
+
+			/* Reinitialize the parallel context to relaunch parallel workers */
+			ReinitializeParallelDSM(lps->pcxt);
+		}
+
+		/*
+		 * Set up shared cost balance and the number of active workers for
+		 * vacuum delay.  We need to do this before launching workers as
+		 * otherwise, they might not see the updated values for these
+		 * parameters.
+		 */
+		pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
+		pg_atomic_write_u32(&(lps->lvshared->active_nworkers), 0);
+
+		/*
+		 * The number of workers can vary between bulkdelete and cleanup
+		 * phase.
+		 */
+		ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+		LaunchParallelWorkers(lps->pcxt);
+
+		if (lps->pcxt->nworkers_launched > 0)
+		{
+			/*
+			 * Reset the local cost values for leader backend as we have
+			 * already accumulated the remaining balance of heap.
+			 */
+			VacuumCostBalance = 0;
+			VacuumCostBalanceLocal = 0;
+
+			/* Enable shared cost balance for leader backend */
+			VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+			VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+		}
+
+		if (lps->lvshared->for_cleanup)
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+		else
+			ereport(elevel,
+					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+									 lps->pcxt->nworkers_launched),
+							lps->pcxt->nworkers_launched, nworkers)));
+	}
+
+	/* Process the indexes that can be processed by only leader process */
+	vacuum_indexes_leader(Irel, stats, vacrelstats, lps, nindexes);
+
+	/*
+	 * Join as a parallel worker.  The leader process alone processes all the
+	 * indexes in the case where no workers are launched.
+	 */
+	parallel_vacuum_index(Irel, stats, lps->lvshared,
+						  vacrelstats->dead_tuples, nindexes);
+
+	/* Wait for all vacuum workers to finish */
+	WaitForParallelWorkersToFinish(lps->pcxt);
+
+	/*
+	 * Carry the shared balance value to heap scan and disable shared
+	 * costing
+	 */
+	if (VacuumSharedCostBalance)
+	{
+		VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance);
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
+	}
+}
+
+/*
+ * Index vacuum/cleanup routine used by the leader process and parallel
+ * vacuum worker processes to process the indexes in parallel.
+ */
+static void
+parallel_vacuum_index(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVShared *lvshared, LVDeadTuples *dead_tuples,
+					  int nindexes)
+{
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	/* Loop until all indexes are vacuumed */
+	for (;;)
+	{
+		int			idx;
+		LVSharedIndStats *shared_indstats;
+
+		/* Get an index number to process */
+		idx = pg_atomic_fetch_add_u32(&(lvshared->idx), 1);
+
+		/* Done for all indexes? */
+		if (idx >= nindexes)
+			break;
+
+		/* Get the index statistics of this index from DSM */
+		shared_indstats = get_indstats(lvshared, idx);
+
+		/*
+		 * Skip processing indexes that don't participate in parallel
+		 * operation
+		 */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[idx], lvshared))
+			continue;
+
+		/* Do vacuum or cleanup of the index */
+		vacuum_one_index(Irel[idx], &(stats[idx]), lvshared, shared_indstats,
+						 dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup indexes that can be processed by only the leader process
+ * because these indexes don't support parallel operation at that phase.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, IndexBulkDeleteResult **stats,
+					  LVRelStats *vacrelstats, LVParallelState *lps,
+					  int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/*
+	 * Increment the active worker count if we are able to launch any worker.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *shared_indstats;
+
+		shared_indstats = get_indstats(lps->lvshared, i);
+
+		/* Process the indexes skipped by parallel workers */
+		if (shared_indstats == NULL ||
+			skip_parallel_vacuum_index(Irel[i], lps->lvshared))
+			vacuum_one_index(Irel[i], &(stats[i]), lps->lvshared,
+							 shared_indstats, vacrelstats->dead_tuples);
+	}
+
+	/*
+	 * We have completed the index vacuum so decrement the active worker
+	 * count.
+	 */
+	if (VacuumActiveNWorkers)
+		pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+}
+
+/*
+ * Vacuum or cleanup one index, either by the leader process or by one of the
+ * worker processes.  After processing the index this function copies the index
+ * statistics returned from ambulkdelete and amvacuumcleanup to the DSM
+ * segment.
+ */
+static void
+vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+				 LVShared *lvshared, LVSharedIndStats *shared_indstats,
+				 LVDeadTuples *dead_tuples)
+{
+	IndexBulkDeleteResult *bulkdelete_res = NULL;
+
+	if (shared_indstats)
+	{
+		/* Get the space for IndexBulkDeleteResult */
+		bulkdelete_res = &(shared_indstats->stats);
+
+		/*
+		 * Update the pointer to the corresponding bulk-deletion result if
+		 * someone has already updated it.
+		 */
+		if (shared_indstats->updated && *stats == NULL)
+			*stats = bulkdelete_res;
+	}
+
+	/* Do vacuum or cleanup of the index */
+	if (lvshared->for_cleanup)
+		lazy_cleanup_index(indrel, stats, lvshared->reltuples,
+						   lvshared->estimated_count);
+	else
+		lazy_vacuum_index(indrel, stats, dead_tuples,
+						  lvshared->reltuples);
+
+	/*
+	 * Copy the index bulk-deletion result returned from ambulkdelete and
+	 * amvacuumcleanup to the DSM segment if this is the first time we got
+	 * it from them, because they allocate it locally and it's possible that
+	 * an index will be vacuumed by a different vacuum process the next
+	 * time.  The copying of the result normally happens only after the
+	 * first index vacuuming.  From the second time onward, we pass the
+	 * result via the DSM segment so that they update it directly.
+	 *
+	 * Since all vacuum workers write the bulk-deletion result at different
+	 * slots we can write them without locking.
+	 */
+	if (shared_indstats && !shared_indstats->updated && *stats != NULL)
+	{
+		memcpy(bulkdelete_res, *stats, sizeof(IndexBulkDeleteResult));
+		shared_indstats->updated = true;
+
+		/*
+		 * Now that the stats[idx] points to the DSM segment, we don't need
+		 * the locally allocated results.
+		 */
+		pfree(*stats);
+		*stats = bulkdelete_res;
+	}
+}
+
+/*
+ *	lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
+ *
+ * Cleanup indexes.  We process the indexes serially unless we are doing
+ * parallel vacuum.
+ */
+static void
+lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+						 LVRelStats *vacrelstats, LVParallelState *lps,
+						 int nindexes)
+{
+	int			idx;
+
+	Assert(!IsParallelWorker());
+	Assert(nindexes > 0);
+
+	/* Report that we are now cleaning up indexes */
+	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
+
+	/*
+	 * If parallel vacuum is active we perform index cleanup with parallel
+	 * workers.
+	 */
+	if (ParallelVacuumIsActive(lps))
+	{
+		/* Tell parallel workers to do index cleanup */
+		lps->lvshared->for_cleanup = true;
+		lps->lvshared->first_time =
+			(vacrelstats->num_index_scans == 0);
+
+		/*
+		 * Now we can provide a better estimate of total number of surviving
+		 * tuples (we assume indexes are more interested in that than in the
+		 * number of nominally live tuples).
+		 */
+		lps->lvshared->reltuples = vacrelstats->new_rel_tuples;
+		lps->lvshared->estimated_count =
+			(vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+
+		lazy_parallel_vacuum_indexes(Irel, stats, vacrelstats, lps, nindexes);
+	}
+	else
+	{
+		for (idx = 0; idx < nindexes; idx++)
+			lazy_cleanup_index(Irel[idx], &stats[idx],
+							   vacrelstats->new_rel_tuples,
+							   vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	}
+}
 
 /*
  *	lazy_vacuum_index() -- vacuum one index relation.
  *
  *		Delete all the index entries pointing to tuples listed in
- *		vacrelstats->dead_tuples, and update running statistics.
+ *		dead_tuples, and update running statistics.
+ *
+ *		reltuples is the number of heap tuples to be passed to the
+ *		bulkdelete callback.
  */
 static void
-lazy_vacuum_index(Relation indrel,
-				  IndexBulkDeleteResult **stats,
-				  LVRelStats *vacrelstats)
+lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
+				  LVDeadTuples *dead_tuples, double reltuples)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1741,30 +2324,38 @@ lazy_vacuum_index(Relation indrel,
 	ivinfo.report_progress = false;
 	ivinfo.estimated_count = true;
 	ivinfo.message_level = elevel;
-	/* We can only provide an approximate value of num_heap_tuples here */
-	ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
 	/* Do bulk deletion */
 	*stats = index_bulk_delete(&ivinfo, *stats,
-							   lazy_tid_reaped, (void *) vacrelstats);
+							   lazy_tid_reaped, (void *) dead_tuples);
+
+	if (IsParallelWorker())
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions by parallel vacuum worker");
+	else
+		msg = gettext_noop("scanned index \"%s\" to remove %d row versions");
 
 	ereport(elevel,
-			(errmsg("scanned index \"%s\" to remove %d row versions",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					vacrelstats->num_dead_tuples),
+					dead_tuples->num_tuples),
 			 errdetail_internal("%s", pg_rusage_show(&ru0))));
 }
 
 /*
  *	lazy_cleanup_index() -- do post-vacuum cleanup for one index relation.
+ *
+ *		reltuples is the number of heap tuples and estimated_count is true
+ *		if the reltuples is an estimated value.
  */
 static void
 lazy_cleanup_index(Relation indrel,
-				   IndexBulkDeleteResult *stats,
-				   LVRelStats *vacrelstats)
+				   IndexBulkDeleteResult **stats,
+				   double reltuples, bool estimated_count)
 {
 	IndexVacuumInfo ivinfo;
+	const char *msg;
 	PGRUsage	ru0;
 
 	pg_rusage_init(&ru0);
@@ -1772,49 +2363,33 @@ lazy_cleanup_index(Relation indrel,
 	ivinfo.index = indrel;
 	ivinfo.analyze_only = false;
 	ivinfo.report_progress = false;
-	ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
+	ivinfo.estimated_count = estimated_count;
 	ivinfo.message_level = elevel;
 
-	/*
-	 * Now we can provide a better estimate of total number of surviving
-	 * tuples (we assume indexes are more interested in that than in the
-	 * number of nominally live tuples).
-	 */
-	ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
+	ivinfo.num_heap_tuples = reltuples;
 	ivinfo.strategy = vac_strategy;
 
-	stats = index_vacuum_cleanup(&ivinfo, stats);
+	*stats = index_vacuum_cleanup(&ivinfo, *stats);
 
-	if (!stats)
+	if (!(*stats))
 		return;
 
-	/*
-	 * Now update statistics in pg_class, but only if the index says the count
-	 * is accurate.
-	 */
-	if (!stats->estimated_count)
-		vac_update_relstats(indrel,
-							stats->num_pages,
-							stats->num_index_tuples,
-							0,
-							false,
-							InvalidTransactionId,
-							InvalidMultiXactId,
-							false);
+	if (IsParallelWorker())
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages as reported by parallel vacuum worker");
+	else
+		msg = gettext_noop("index \"%s\" now contains %.0f row versions in %u pages");
 
 	ereport(elevel,
-			(errmsg("index \"%s\" now contains %.0f row versions in %u pages",
+			(errmsg(msg,
 					RelationGetRelationName(indrel),
-					stats->num_index_tuples,
-					stats->num_pages),
+					(*stats)->num_index_tuples,
+					(*stats)->num_pages),
 			 errdetail("%.0f index row versions were removed.\n"
 					   "%u index pages have been deleted, %u are currently reusable.\n"
 					   "%s.",
-					   stats->tuples_removed,
-					   stats->pages_deleted, stats->pages_free,
+					   (*stats)->tuples_removed,
+					   (*stats)->pages_deleted, (*stats)->pages_free,
 					   pg_rusage_show(&ru0))));
-
-	pfree(stats);
 }
 
 /*
@@ -2122,19 +2697,17 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 }
 
 /*
- * lazy_space_alloc - space allocation decisions for lazy vacuum
- *
- * See the comments at the head of this file for rationale.
+ * Return the maximum number of dead tuples we can record.
  */
-static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+static long
+compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->useindex)
+	if (useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
@@ -2148,34 +2721,48 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 		maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
 	}
 	else
-	{
 		maxtuples = MaxHeapTuplesPerPage;
-	}
 
-	vacrelstats->num_dead_tuples = 0;
-	vacrelstats->max_dead_tuples = (int) maxtuples;
-	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+	return maxtuples;
+}
+
+/*
+ * lazy_space_alloc - space allocation decisions for lazy vacuum
+ *
+ * See the comments at the head of this file for rationale.
+ */
+static void
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+{
+	LVDeadTuples *dead_tuples = NULL;
+	long		maxtuples;
+
+	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
+
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples->num_tuples = 0;
+	dead_tuples->max_tuples = (int) maxtuples;
+
+	vacrelstats->dead_tuples = dead_tuples;
 }
 
 /*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-lazy_record_dead_tuple(LVRelStats *vacrelstats,
-					   ItemPointer itemptr)
+lazy_record_dead_tuple(LVDeadTuples *dead_tuples, ItemPointer itemptr)
 {
 	/*
 	 * The array shouldn't overflow under normal behavior, but perhaps it
 	 * could if we are given a really small maintenance_work_mem. In that
 	 * case, just forget the last few tuples (we'll get 'em next time).
 	 */
-	if (vacrelstats->num_dead_tuples < vacrelstats->max_dead_tuples)
+	if (dead_tuples->num_tuples < dead_tuples->max_tuples)
 	{
-		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
-		vacrelstats->num_dead_tuples++;
+		dead_tuples->itemptrs[dead_tuples->num_tuples] = *itemptr;
+		dead_tuples->num_tuples++;
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
-									 vacrelstats->num_dead_tuples);
+									 dead_tuples->num_tuples);
 	}
 }
 
@@ -2189,12 +2776,12 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 static bool
 lazy_tid_reaped(ItemPointer itemptr, void *state)
 {
-	LVRelStats *vacrelstats = (LVRelStats *) state;
+	LVDeadTuples *dead_tuples = (LVDeadTuples *) state;
 	ItemPointer res;
 
 	res = (ItemPointer) bsearch((void *) itemptr,
-								(void *) vacrelstats->dead_tuples,
-								vacrelstats->num_dead_tuples,
+								(void *) dead_tuples->itemptrs,
+								dead_tuples->num_tuples,
 								sizeof(ItemPointerData),
 								vac_cmp_itemptr);
 
@@ -2342,3 +2929,447 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/*
+ * Compute the number of parallel worker processes to request.  Both index
+ * vacuum and index cleanup can be executed with parallel workers.  The index
+ * is eligible for parallel vacuum iff its size is greater than
+ * min_parallel_index_scan_size, as invoking workers for very small indexes
+ * can hurt performance.
+ *
+ * nrequested is the number of parallel workers that the user requested.  If
+ * nrequested is 0, we compute the parallel degree based on nindexes, that is
+ * the number of indexes that support parallel vacuum.  This function also
+ * sets can_parallel_vacuum to remember indexes that participate in parallel
+ * vacuum.
+ */
+static int
+compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
+								bool *can_parallel_vacuum)
+{
+	int			nindexes_parallel = 0;
+	int			nindexes_parallel_bulkdel = 0;
+	int			nindexes_parallel_cleanup = 0;
+	int			parallel_workers;
+	int			i;
+
+	/*
+	 * We don't allow performing parallel operations in a standalone backend
+	 * or when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+		return 0;
+
+	/*
+	 * Compute the number of indexes that can participate in parallel vacuum.
+	 */
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+			RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+			continue;
+
+		can_parallel_vacuum[i] = true;
+
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			nindexes_parallel_bulkdel++;
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			nindexes_parallel_cleanup++;
+	}
+
+	nindexes_parallel = Max(nindexes_parallel_bulkdel,
+							nindexes_parallel_cleanup);
+
+	/* The leader process takes one index */
+	nindexes_parallel--;
+
+	/* No index supports parallel vacuum */
+	if (nindexes_parallel <= 0)
+		return 0;
+
+	/* Compute the parallel degree */
+	parallel_workers = (nrequested > 0) ?
+		Min(nrequested, nindexes_parallel) : nindexes_parallel;
+
+	/* Cap by max_parallel_maintenance_workers */
+	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+
+	return parallel_workers;
+}
+
+/*
+ * Initialize variables for shared index statistics, set NULL bitmap and the
+ * size of stats for each index.
+ */
+static void
+prepare_index_statistics(LVShared *lvshared, bool *can_parallel_vacuum,
+						 int nindexes)
+{
+	int			i;
+
+	/* Currently, we don't support parallel vacuum for autovacuum */
+	Assert(!IsAutoVacuumWorkerProcess());
+
+	/* Set NULL for all indexes */
+	memset(lvshared->bitmap, 0x00, BITMAPLEN(nindexes));
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		/* Set NOT NULL as this index does support parallelism */
+		lvshared->bitmap[i >> 3] |= 1 << (i & 0x07);
+	}
+}
+
+/*
+ * Update index statistics in pg_class if the statistics are accurate.
+ */
+static void
+update_index_statistics(Relation *Irel, IndexBulkDeleteResult **stats,
+						int nindexes)
+{
+	int			i;
+
+	Assert(!IsInParallelMode());
+
+	for (i = 0; i < nindexes; i++)
+	{
+		if (stats[i] == NULL || stats[i]->estimated_count)
+			continue;
+
+		/* Update index statistics */
+		vac_update_relstats(Irel[i],
+							stats[i]->num_pages,
+							stats[i]->num_index_tuples,
+							0,
+							false,
+							InvalidTransactionId,
+							InvalidMultiXactId,
+							false);
+		pfree(stats[i]);
+	}
+}
+
+/*
+ * This function prepares and returns parallel vacuum state if we can launch
+ * even one worker.  This function is responsible for entering parallel
+ * mode, creating a parallel context, and then initializing the DSM segment.
+ */
+static LVParallelState *
+begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
+					  BlockNumber nblocks, int nindexes, int nrequested)
+{
+	LVParallelState *lps = NULL;
+	ParallelContext *pcxt;
+	LVShared   *shared;
+	LVDeadTuples *dead_tuples;
+	bool	   *can_parallel_vacuum;
+	long		maxtuples;
+	char	   *sharedquery;
+	Size		est_shared;
+	Size		est_deadtuples;
+	int			nindexes_mwm = 0;
+	int			parallel_workers = 0;
+	int			querylen;
+	int			i;
+
+	/*
+	 * A parallel vacuum must be requested and there must be indexes on the
+	 * relation
+	 */
+	Assert(nrequested >= 0);
+	Assert(nindexes > 0);
+
+	/*
+	 * Compute the number of parallel vacuum workers to launch
+	 */
+	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	parallel_workers = compute_parallel_vacuum_workers(Irel, nindexes,
+													   nrequested,
+													   can_parallel_vacuum);
+
+	/* Can't perform vacuum in parallel */
+	if (parallel_workers <= 0)
+	{
+		pfree(can_parallel_vacuum);
+		return lps;
+	}
+
+	lps = (LVParallelState *) palloc0(sizeof(LVParallelState));
+
+	EnterParallelMode();
+	pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+								 parallel_workers);
+	Assert(pcxt->nworkers > 0);
+	lps->pcxt = pcxt;
+
+	/* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
+	est_shared = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	for (i = 0; i < nindexes; i++)
+	{
+		uint8		vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;
+
+		/*
+		 * The cleanup option should be either disabled, always performed in
+		 * parallel, or conditionally performed in parallel.
+		 */
+		Assert(((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) ||
+			   ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0));
+		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
+
+		/* Skip indexes that don't participate in parallel vacuum */
+		if (!can_parallel_vacuum[i])
+			continue;
+
+		if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+			nindexes_mwm++;
+
+		est_shared = add_size(est_shared, sizeof(LVSharedIndStats));
+
+		/*
+		 * Remember the number of indexes that support parallel operation for
+		 * each phase.
+		 */
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
+			lps->nindexes_parallel_bulkdel++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0)
+			lps->nindexes_parallel_cleanup++;
+		if ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0)
+			lps->nindexes_parallel_condcleanup++;
+	}
+	shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+	maxtuples = compute_max_dead_tuples(nblocks, true);
+	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	InitializeParallelDSM(pcxt);
+
+	/* Prepare shared information */
+	shared = (LVShared *) shm_toc_allocate(pcxt->toc, est_shared);
+	MemSet(shared, 0, est_shared);
+	shared->relid = relid;
+	shared->elevel = elevel;
+	shared->maintenance_work_mem_worker =
+		(nindexes_mwm > 0) ?
+		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+		maintenance_work_mem;
+
+	pg_atomic_init_u32(&(shared->cost_balance), 0);
+	pg_atomic_init_u32(&(shared->active_nworkers), 0);
+	pg_atomic_init_u32(&(shared->idx), 0);
+	shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+	prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
+	lps->lvshared = shared;
+
+	/* Prepare the dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_allocate(pcxt->toc, est_deadtuples);
+	dead_tuples->max_tuples = maxtuples;
+	dead_tuples->num_tuples = 0;
+	MemSet(dead_tuples->itemptrs, 0, sizeof(ItemPointerData) * maxtuples);
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
+	vacrelstats->dead_tuples = dead_tuples;
+
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	sharedquery[querylen] = '\0';
+	shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
+
+	pfree(can_parallel_vacuum);
+	return lps;
+}
+
+/*
+ * Destroy the parallel context, and end parallel mode.
+ *
+ * Since writes are not allowed during parallel mode, we copy the updated
+ * index statistics from DSM into local memory and then later use that
+ * to update the index statistics.  One might think that we can exit from
+ * parallel mode, update the index statistics and then destroy parallel
+ * context, but that won't be safe (see ExitParallelMode).
+ */
+static void
+end_parallel_vacuum(Relation *Irel, IndexBulkDeleteResult **stats,
+					LVParallelState *lps, int nindexes)
+{
+	int			i;
+
+	Assert(!IsParallelWorker());
+
+	/* Copy the updated statistics */
+	for (i = 0; i < nindexes; i++)
+	{
+		LVSharedIndStats *indstats = get_indstats(lps->lvshared, i);
+
+		/*
+		 * Skip unused slot.  The statistics of this index are already stored
+		 * in local memory.
+		 */
+		if (indstats == NULL)
+			continue;
+
+		if (indstats->updated)
+		{
+			stats[i] = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+			memcpy(stats[i], &(indstats->stats), sizeof(IndexBulkDeleteResult));
+		}
+		else
+			stats[i] = NULL;
+	}
+
+	DestroyParallelContext(lps->pcxt);
+	ExitParallelMode();
+
+	/* Deactivate parallel vacuum */
+	pfree(lps);
+	lps = NULL;
+}
+
+/* Return the Nth index statistics or NULL */
+static LVSharedIndStats *
+get_indstats(LVShared *lvshared, int n)
+{
+	int			i;
+	char	   *p;
+
+	if (IndStatsIsNull(lvshared, n))
+		return NULL;
+
+	p = (char *) GetSharedIndStats(lvshared);
+	for (i = 0; i < n; i++)
+	{
+		if (IndStatsIsNull(lvshared, i))
+			continue;
+
+		p += sizeof(LVSharedIndStats);
+	}
+
+	return (LVSharedIndStats *) p;
+}
+
+/*
+ * Returns true if the given index can't participate in parallel index vacuum
+ * or parallel index cleanup; returns false otherwise.
+ */
+static bool
+skip_parallel_vacuum_index(Relation indrel, LVShared *lvshared)
+{
+	uint8		vacoptions = indrel->rd_indam->amparallelvacuumoptions;
+
+	/* first_time must be true only if for_cleanup is true */
+	Assert(lvshared->for_cleanup || !lvshared->first_time);
+
+	if (lvshared->for_cleanup)
+	{
+		/* Skip, if the index does not support parallel cleanup */
+		if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) == 0) &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) == 0))
+			return true;
+
+		/*
+		 * Skip, if the index supports parallel cleanup conditionally, but we
+		 * have already processed the index (for bulkdelete).  See the
+		 * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
+		 * when indexes support parallel cleanup conditionally.
+		 */
+		if (!lvshared->first_time &&
+			((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
+			return true;
+	}
+	else if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) == 0)
+	{
+		/* Skip if the index does not support parallel bulk deletion */
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Perform work within a launched parallel process.
+ *
+ * Since parallel vacuum workers perform only index vacuum or index cleanup,
+ * we don't need to report progress information.
+ */
+void
+parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
+{
+	Relation	onerel;
+	Relation   *indrels;
+	LVShared   *lvshared;
+	LVDeadTuples *dead_tuples;
+	int			nindexes;
+	char	   *sharedquery;
+	IndexBulkDeleteResult **stats;
+
+	lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED,
+										   false);
+	elevel = lvshared->elevel;
+
+	ereport(DEBUG1,
+			(errmsg("starting parallel vacuum worker for %s",
+					lvshared->for_cleanup ? "cleanup" : "bulk delete")));
+
+	/* Set debug_query_string for individual workers */
+	sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, false);
+	debug_query_string = sharedquery;
+	pgstat_report_activity(STATE_RUNNING, debug_query_string);
+
+	/*
+	 * Open table.  The lock mode is the same as the leader process's.  It's
+	 * okay because the lock mode does not conflict among the parallel
+	 * workers.
+	 */
+	onerel = table_open(lvshared->relid, ShareUpdateExclusiveLock);
+
+	/*
+	 * Open all indexes.  indrels are sorted in OID order, which should match
+	 * the leader's order.
+	 */
+	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &indrels);
+	Assert(nindexes > 0);
+
+	/* Set dead tuple space */
+	dead_tuples = (LVDeadTuples *) shm_toc_lookup(toc,
+												  PARALLEL_VACUUM_KEY_DEAD_TUPLES,
+												  false);
+
+	/* Set cost-based vacuum delay */
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+	VacuumCostBalanceLocal = 0;
+	VacuumSharedCostBalance = &(lvshared->cost_balance);
+	VacuumActiveNWorkers = &(lvshared->active_nworkers);
+
+	stats = (IndexBulkDeleteResult **)
+		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
+
+	if (lvshared->maintenance_work_mem_worker > 0)
+		maintenance_work_mem = lvshared->maintenance_work_mem_worker;
+
+	/* Process indexes to perform vacuum/cleanup */
+	parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes);
+
+	vac_close_indexes(nindexes, indrels, RowExclusiveLock);
+	table_close(onerel, ShareUpdateExclusiveLock);
+	pfree(stats);
+}
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index f3e2254..df06e7d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/nbtree.h"
 #include "access/parallel.h"
 #include "access/session.h"
@@ -139,6 +140,9 @@ static const struct
 	},
 	{
 		"_bt_parallel_build_main", _bt_parallel_build_main
+	},
+	{
+		"parallel_vacuum_main", parallel_vacuum_main
 	}
 };
 
@@ -174,6 +178,7 @@ CreateParallelContext(const char *library_name, const char *function_name,
 	pcxt = palloc0(sizeof(ParallelContext));
 	pcxt->subid = GetCurrentSubTransactionId();
 	pcxt->nworkers = nworkers;
+	pcxt->nworkers_to_launch = nworkers;
 	pcxt->library_name = pstrdup(library_name);
 	pcxt->function_name = pstrdup(function_name);
 	pcxt->error_context_stack = error_context_stack;
@@ -487,6 +492,23 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 }
 
 /*
+ * Reinitialize parallel workers for a parallel context such that we can
+ * launch a different number of workers.  This is required for cases where
+ * we need to reuse the same DSM segment, but the number of workers can
+ * vary from run to run.
+ */
+void
+ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
+{
+	/*
+	 * The number of workers to launch must not exceed the number of
+	 * workers with which the parallel context is initialized.
+	 */
+	Assert(pcxt->nworkers >= nworkers_to_launch);
+	pcxt->nworkers_to_launch = nworkers_to_launch;
+}
+
+/*
  * Launch parallel workers.
  */
 void
@@ -498,7 +520,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	bool		any_registrations_failed = false;
 
 	/* Skip this if we have no workers. */
-	if (pcxt->nworkers == 0)
+	if (pcxt->nworkers == 0 || pcxt->nworkers_to_launch == 0)
 		return;
 
 	/* We need to be a lock group leader. */
@@ -533,7 +555,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	 * fails.  It wouldn't help much anyway, because registering the worker in
 	 * no way guarantees that it will start up and initialize successfully.
 	 */
-	for (i = 0; i < pcxt->nworkers; ++i)
+	for (i = 0; i < pcxt->nworkers_to_launch; ++i)
 	{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bb34e25..d625d17 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -42,6 +42,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
+#include "postmaster/bgworker_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -68,6 +69,14 @@ static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
 
+/*
+ * Variables for cost-based parallel vacuum.  See comments atop
+ * compute_parallel_delay to understand how it works.
+ */
+pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
+pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
+int			VacuumCostBalanceLocal = 0;
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
 static List *get_all_vacuum_rels(int options);
@@ -76,6 +85,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 							  TransactionId lastSaneFrozenXid,
 							  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static double compute_parallel_delay(void);
 static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
@@ -94,12 +104,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		freeze = false;
 	bool		full = false;
 	bool		disable_page_skipping = false;
+	bool		parallel_option = false;
 	ListCell   *lc;
 
 	/* Set default value */
 	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 	params.truncate = VACOPT_TERNARY_DEFAULT;
 
+	/* By default parallel vacuum is enabled */
+	params.nworkers = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -129,6 +143,39 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
 			params.truncate = get_vacopt_ternary_value(opt);
+		else if (strcmp(opt->defname, "parallel") == 0)
+		{
+			parallel_option = true;
+			if (opt->arg == NULL)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("parallel option requires a value between 0 and %d",
+								MAX_PARALLEL_WORKER_LIMIT),
+						 parser_errposition(pstate, opt->location)));
+			}
+			else
+			{
+				int			nworkers;
+
+				nworkers = defGetInt32(opt);
+				if (nworkers < 0 || nworkers > MAX_PARALLEL_WORKER_LIMIT)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("parallel vacuum degree must be between 0 and %d",
+									MAX_PARALLEL_WORKER_LIMIT),
+							 parser_errposition(pstate, opt->location)));
+
+				/*
+				 * Disable parallel vacuum if the user has specified the
+				 * parallel degree as zero.
+				 */
+				if (nworkers == 0)
+					params.nworkers = -1;
+				else
+					params.nworkers = nworkers;
+			}
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -152,6 +199,11 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		   !(params.options & (VACOPT_FULL | VACOPT_FREEZE)));
 	Assert(!(params.options & VACOPT_SKIPTOAST));
 
+	if ((params.options & VACOPT_FULL) && parallel_option)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot specify both FULL and PARALLEL options")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -383,6 +435,9 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumCostBalanceLocal = 0;
+		VacuumSharedCostBalance = NULL;
+		VacuumActiveNWorkers = NULL;
 
 		/*
 		 * Loop to process each selected relation.
@@ -1941,16 +1996,26 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 void
 vacuum_delay_point(void)
 {
+	double		msec = 0;
+
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* Nap if appropriate */
-	if (VacuumCostActive && !InterruptPending &&
-		VacuumCostBalance >= VacuumCostLimit)
-	{
-		double		msec;
+	if (!VacuumCostActive || InterruptPending)
+		return;
 
+	/*
+	 * For parallel vacuum, the delay is computed based on the shared cost
+	 * balance.  See compute_parallel_delay.
+	 */
+	if (VacuumSharedCostBalance != NULL)
+		msec = compute_parallel_delay();
+	else if (VacuumCostBalance >= VacuumCostLimit)
 		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+
+	/* Nap if appropriate */
+	if (msec > 0)
+	{
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
@@ -1967,6 +2032,66 @@ vacuum_delay_point(void)
 }
 
 /*
+ * Computes the vacuum delay for parallel workers.
+ *
+ * The basic idea of a cost-based vacuum delay for parallel vacuum is to allow
+ * each worker to sleep proportional to the work done by it.  We achieve this
+ * by allowing all parallel vacuum workers including the leader process to
+ * have a shared view of cost related parameters (mainly VacuumCostBalance).
+ * We allow each worker to update it as and when it has incurred any cost and
+ * then based on that decide whether it needs to sleep.  We compute the time
+ * to sleep for a worker based on the cost it has incurred
+ * (VacuumCostBalanceLocal) and then reduce the VacuumSharedCostBalance by
+ * that amount.  This avoids putting to sleep workers that have done little
+ * or no I/O compared to other workers, and thereby ensures that workers
+ * doing more I/O are throttled more.
+ *
+ * We allow any worker to sleep only if it has performed the I/O above a
+ * certain threshold, which is calculated based on the number of active
+ * workers (VacuumActiveNWorkers), and the overall cost balance is more than
+ * VacuumCostLimit set by the system.  Testing reveals that we achieve
+ * the required throttling if we allow a worker that has done more than 50%
+ * of its share of work to sleep.
+ */
+static double
+compute_parallel_delay(void)
+{
+	double		msec = 0;
+	uint32		shared_balance;
+	int			nworkers;
+
+	/* Parallel vacuum must be active */
+	Assert(VacuumSharedCostBalance);
+
+	nworkers = pg_atomic_read_u32(VacuumActiveNWorkers);
+
+	/* At least count itself */
+	Assert(nworkers >= 1);
+
+	/* Update the shared cost balance value atomically */
+	shared_balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance, VacuumCostBalance);
+
+	/* Compute the total local balance for the current worker */
+	VacuumCostBalanceLocal += VacuumCostBalance;
+
+	if ((shared_balance >= VacuumCostLimit) &&
+		(VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers)))
+	{
+		/* Compute sleep time based on the local cost balance */
+		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
+		VacuumCostBalanceLocal = 0;
+	}
+
+	/*
+	 * Reset the local balance as we accumulated it into the shared value.
+	 */
+	VacuumCostBalance = 0;
+
+	return msec;
+}
+
+/*
  * A wrapper function of defGetBoolean().
  *
  * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e3..6d1f28c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+		/* As of now, we don't support parallel vacuum for autovacuum */
+		tab->at_params.nworkers = -1;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b52396c..052d98b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3597,7 +3597,7 @@ psql_completion(const char *text, int start, int end)
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
-						  "INDEX_CLEANUP", "TRUNCATE");
+						  "INDEX_CLEANUP", "TRUNCATE", "PARALLEL");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|TRUNCATE"))
 			COMPLETE_WITH("ON", "OFF");
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 580b4ca..00a17f5 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -23,7 +23,9 @@
 #include "nodes/lockoptions.h"
 #include "nodes/primnodes.h"
 #include "storage/bufpage.h"
+#include "storage/dsm.h"
 #include "storage/lockdefs.h"
+#include "storage/shm_toc.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
 
@@ -193,6 +195,7 @@ extern Size SyncScanShmemSize(void);
 struct VacuumParams;
 extern void heap_vacuum_rel(Relation onerel,
 							struct VacuumParams *params, BufferAccessStrategy bstrategy);
+extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 646708b..fc6a560 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -33,7 +33,8 @@ typedef struct ParallelContext
 {
 	dlist_node	node;
 	SubTransactionId subid;
-	int			nworkers;
+	int			nworkers;		/* Maximum number of workers to launch */
+	int			nworkers_to_launch; /* Actual number of workers to launch */
 	int			nworkers_launched;
 	char	   *library_name;
 	char	   *function_name;
@@ -63,6 +64,7 @@ extern ParallelContext *CreateParallelContext(const char *library_name,
 											  const char *function_name, int nworkers);
 extern void InitializeParallelDSM(ParallelContext *pcxt);
 extern void ReinitializeParallelDSM(ParallelContext *pcxt);
+extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
 extern void LaunchParallelWorkers(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
 extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index b3351ad..c27d255 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -222,6 +222,13 @@ typedef struct VacuumParams
 										 * default value depends on reloptions */
 	VacOptTernaryValue truncate;	/* Truncate empty pages at the end,
 									 * default value depends on reloptions */
+
+	/*
+	 * The number of parallel vacuum workers.  0 by default, which means the
+	 * degree is chosen based on the number of indexes.  -1 indicates that
+	 * parallel vacuum is disabled.
+	 */
+	int			nworkers;
 } VacuumParams;
 
 /* GUC parameters */
@@ -231,6 +238,11 @@ extern int	vacuum_freeze_table_age;
 extern int	vacuum_multixact_freeze_min_age;
 extern int	vacuum_multixact_freeze_table_age;
 
+/* Variables for cost-based parallel vacuum */
+extern pg_atomic_uint32 *VacuumSharedCostBalance;
+extern pg_atomic_uint32 *VacuumActiveNWorkers;
+extern int	VacuumCostBalanceLocal;
+
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9996d88..f4250a4 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -92,6 +92,40 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+VACUUM (PARALLEL -1) pvactst; -- error
+ERROR:  parallel vacuum degree must be between 0 and 1024
+LINE 1: VACUUM (PARALLEL -1) pvactst;
+                ^
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+ERROR:  cannot specify both FULL and PARALLEL options
+VACUUM (PARALLEL) pvactst; -- error, cannot use PARALLEL option without parallel degree
+ERROR:  parallel option requires a value between 0 and 1024
+LINE 1: VACUUM (PARALLEL) pvactst;
+                ^
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+WARNING:  disabling parallel option of vacuum on "tmp" --- cannot vacuum temporary tables in parallel
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 69987f7..cf741f7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -75,6 +75,37 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- PARALLEL option
+CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
+INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,1000) i;
+CREATE INDEX btree_pvactst ON pvactst USING btree (i);
+CREATE INDEX hash_pvactst ON pvactst USING hash (i);
+CREATE INDEX brin_pvactst ON pvactst USING brin (i);
+CREATE INDEX gin_pvactst ON pvactst USING gin (a);
+CREATE INDEX gist_pvactst ON pvactst USING gist (p);
+CREATE INDEX spgist_pvactst ON pvactst USING spgist (p);
+
+-- VACUUM invokes parallel index cleanup
+SET min_parallel_index_scan_size to 0;
+VACUUM (PARALLEL 2) pvactst;
+
+-- VACUUM invokes parallel bulk-deletion
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 2) pvactst;
+
+UPDATE pvactst SET i = i WHERE i < 1000;
+VACUUM (PARALLEL 0) pvactst; -- disable parallel vacuum
+
+VACUUM (PARALLEL -1) pvactst; -- error
+VACUUM (PARALLEL 2, INDEX_CLEANUP FALSE) pvactst;
+VACUUM (PARALLEL 2, FULL TRUE) pvactst; -- error, cannot use both PARALLEL and FULL
+VACUUM (PARALLEL) pvactst; -- error, cannot use PARALLEL option without parallel degree
+CREATE TEMPORARY TABLE tmp (a int PRIMARY KEY);
+CREATE INDEX tmp_idx1 ON tmp (a);
+VACUUM (PARALLEL 1) tmp; -- disables parallel vacuum option
+RESET min_parallel_index_scan_size;
+DROP TABLE pvactst;
+
 -- INDEX_CLEANUP option
 CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
 -- Use uncompressed data stored in toast.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index caf6b86..0242e66 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1216,7 +1216,11 @@ LPVOID
 LPWSTR
 LSEG
 LUID
+LVDeadTuples
 LVRelStats
+LVShared
+LVSharedIndStats
+LVParallelState
 LWLock
 LWLockHandle
 LWLockMinimallyPadded
-- 
1.8.3.1

#380Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#379)

On Fri, 17 Jan 2020 at 14:47, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 12:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Jan 17, 2020 at 11:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have performed cost delay testing on the latest patch (I have used the
same script as attached in [1] and [2]).
vacuum_cost_delay = 10
vacuum_cost_limit = 2000

Observation: As we have concluded earlier, the delay time is in sync
with the I/O performed by the worker
and the total delay (heap + index) is almost the same as the
non-parallel operation.

Thanks for doing this test again. In the attached patch, I have
addressed all the comments and modified a few comments.

Hi,
Below are some review comments for v50 patch.

1.
+LVShared
+LVSharedIndStats
+LVParallelState
 LWLock

I think, LVParallelState should come before LVSharedIndStats.

2.
+    /*
+     * It is possible that parallel context is initialized with fewer
workers
+     * then the number of indexes that need a separate worker in the
current
+     * phase, so we need to consider it.  See
compute_parallel_vacuum_workers.
+     */

This comment is confusing me. I think, "then" should be replaced with
"than".

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#381Peter Geoghegan
pg@bowt.ie
In reply to: Amit Kapila (#379)

On Fri, Jan 17, 2020 at 1:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks for doing this test again. In the attached patch, I have
addressed all the comments and modified a few comments.

I am in favor of the general idea of parallel VACUUM that parallelizes
the processing of each index (I haven't looked at the patch, though).
I observed something during a recent benchmark of the deduplication
patch that seems like it might be relevant to parallel VACUUM. This
happened during a recreation of the original WARM benchmark, which is
described here:

/messages/by-id/CABOikdMNy6yowA+wTGK9RVd8iw+CzqHeQSGpW7Yka_4RSZ_LOQ@mail.gmail.com

(There is an extra pgbench_accounts index on abalance, plus 4 indexes
on large text columns with filler MD5 hashes, all of which are
random.)

On the master branch, I can clearly observe that the "filler" MD5
indexes are bloated to a degree that is affected by the order of their
original creation/pg_class OID order. These are all indexes that
become bloated purely due to "version churn" -- or what I like to call
"unnecessary" page splits. The keys used in each pgbench_accounts
logical row never change, except in the case of the extra abalance
index (the idea is to prevent all HOT updates without ever updating
most indexed columns). I noticed that pgb_a_filler1 is a bit less
bloated than pgb_a_filler2, which is a little less bloated than
pgb_a_filler3, which is a little less bloated than pgb_a_filler4. Even
after 4 hours, and even though the "shape" of each index is identical.
This demonstrates an important general principle about vacuuming
indexes: timeliness can matter a lot.

In general, a big benefit of the deduplication patch is that it "buys
time" for VACUUM to run before "unnecessary" page splits can occur --
that is why the deduplication patch prevents *all* page splits in
these "filler" indexes, whereas on the master branch the filler
indexes are about 2x larger (the exact amount varies based on VACUUM
processing order, at least earlier on).

For tables with several indexes, giving each index its own VACUUM
worker process will prevent "unnecessary" page splits caused by
version churn, simply because VACUUM will start to clean each index
sooner than it would compared to serial processing (except for the
"lucky" first index). There is no "lucky" first index that gets
preferential treatment -- presumably VACUUM will start processing each
index at the same time with this patch, making each index equally
"lucky".

I think that there may even be a *complementary* effect with parallel
VACUUM, though I haven't tested that theory. Deduplication "buys time"
for VACUUM to run, while at the same time VACUUM takes less time to
show up and prevent "unnecessary" page splits. My guess is that these
two seemingly unrelated patches may actually address this "unnecessary
page split" problem from two completely different angles, with an
overall effect that is greater than the sum of its parts.

While the difference in size of each filler index on the master branch
wasn't that significant on its own, it's still interesting. It's
probably quite workload dependent.

--
Peter Geoghegan

#382Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Geoghegan (#381)

On Sun, Jan 19, 2020 at 2:15 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Jan 17, 2020 at 1:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks for doing this test again. In the attached patch, I have
addressed all the comments and modified a few comments.

I am in favor of the general idea of parallel VACUUM that parallelizes
the processing of each index (I haven't looked at the patch, though).
I observed something during a recent benchmark of the deduplication
patch that seems like it might be relevant to parallel VACUUM. This
happened during a recreation of the original WARM benchmark, which is
described here:

/messages/by-id/CABOikdMNy6yowA+wTGK9RVd8iw+CzqHeQSGpW7Yka_4RSZ_LOQ@mail.gmail.com

(There is an extra pgbench_accounts index on abalance, plus 4 indexes
on large text columns with filler MD5 hashes, all of which are
random.)

On the master branch, I can clearly observe that the "filler" MD5
indexes are bloated to a degree that is affected by the order of their
original creation/pg_class OID order. These are all indexes that
become bloated purely due to "version churn" -- or what I like to call
"unnecessary" page splits. The keys used in each pgbench_accounts
logical row never change, except in the case of the extra abalance
index (the idea is to prevent all HOT updates without ever updating
most indexed columns). I noticed that pgb_a_filler1 is a bit less
bloated than pgb_a_filler2, which is a little less bloated than
pgb_a_filler3, which is a little less bloated than pgb_a_filler4. Even
after 4 hours, and even though the "shape" of each index is identical.
This demonstrates an important general principle about vacuuming
indexes: timeliness can matter a lot.

In general, a big benefit of the deduplication patch is that it "buys
time" for VACUUM to run before "unnecessary" page splits can occur --
that is why the deduplication patch prevents *all* page splits in
these "filler" indexes, whereas on the master branch the filler
indexes are about 2x larger (the exact amount varies based on VACUUM
processing order, at least earlier on).

For tables with several indexes, giving each index its own VACUUM
worker process will prevent "unnecessary" page splits caused by
version churn, simply because VACUUM will start to clean each index
sooner than it would compared to serial processing (except for the
"lucky" first index). There is no "lucky" first index that gets
preferential treatment -- presumably VACUUM will start processing each
index at the same time with this patch, making each index equally
"lucky".

I think that there may even be a *complementary* effect with parallel
VACUUM, though I haven't tested that theory. Deduplication "buys time"
for VACUUM to run, while at the same time VACUUM takes less time to
show up and prevent "unnecessary" page splits. My guess is that these
two seemingly unrelated patches may actually address this "unnecessary
page split" problem from two completely different angles, with an
overall effect that is greater than the sum of its parts.

Good analysis, and I agree that the parallel vacuum patch can help in
such cases. However, as of now, it only works via the VACUUM command, so
some user intervention is required to realize the benefit.
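For example, a user currently has to run something like
"VACUUM (PARALLEL 4) pgbench_accounts;" by hand (the table name and
degree here are just illustrative), since autovacuum always sets
nworkers = -1 for now.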

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#383Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#380)

On Fri, Jan 17, 2020 at 4:35 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

Below are some review comments for v50 patch.

1.
+LVShared
+LVSharedIndStats
+LVParallelState
LWLock

I think, LVParallelState should come before LVSharedIndStats.

2.
+    /*
+     * It is possible that parallel context is initialized with fewer workers
+     * then the number of indexes that need a separate worker in the current
+     * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+     */

This comment is confusing me. I think, "then" should be replaced with "than".

Pushed, after fixing these two comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#384Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#383)

On Mon, 20 Jan 2020 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 17, 2020 at 4:35 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

Below are some review comments for v50 patch.

1.
+LVShared
+LVSharedIndStats
+LVParallelState
LWLock

I think, LVParallelState should come before LVSharedIndStats.

2.
+    /*
+     * It is possible that parallel context is initialized with fewer workers
+     * then the number of indexes that need a separate worker in the current
+     * phase, so we need to consider it.  See compute_parallel_vacuum_workers.
+     */

This comment is confusing me. I think, "then" should be replaced with "than".

Pushed, after fixing these two comments.

Thank you for committing!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#385Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#383)

Hi,

On 2020-01-20 09:09:35 +0530, Amit Kapila wrote:

Pushed, after fixing these two comments.

When attempting to vacuum a large table I just got:

postgres=# vacuum FREEZE ;
ERROR: invalid memory alloc request size 1073741828

#0 palloc (size=1073741828) at /mnt/tools/src/postgresql/src/backend/utils/mmgr/mcxt.c:959
#1 0x000056452cc45cac in lazy_space_alloc (vacrelstats=0x56452e5ab0e8, vacrelstats=0x56452e5ab0e8, relblocks=24686152)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:2741
#2 lazy_scan_heap (aggressive=true, nindexes=1, Irel=0x56452e5ab1c8, vacrelstats=<optimized out>, params=0x7ffdf8c00290, onerel=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:786
#3 heap_vacuum_rel (onerel=<optimized out>, params=0x7ffdf8c00290, bstrategy=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:472
#4 0x000056452cd8b42c in table_relation_vacuum (bstrategy=<optimized out>, params=0x7ffdf8c00290, rel=0x7fbcdff1e248)
at /mnt/tools/src/postgresql/src/include/access/tableam.h:1450
#5 vacuum_rel (relid=16454, relation=<optimized out>, params=params@entry=0x7ffdf8c00290) at /mnt/tools/src/postgresql/src/backend/commands/vacuum.c:1882

Looks to me that the calculation moved into compute_max_dead_tuples()
continues to use an allocation ceiling
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
but the actual allocation now is

#define SizeOfLVDeadTuples(cnt) \
add_size((offsetof(LVDeadTuples, itemptrs)), \
mul_size(sizeof(ItemPointerData), cnt))

i.e. the overhead of offsetof(LVDeadTuples, itemptrs) is not taken into
account.
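
To make the arithmetic concrete, here is a stand-alone sketch (not
PostgreSQL code; it just assumes the usual values MaxAllocSize =
0x3fffffff, sizeof(ItemPointerData) = 6, and an 8-byte LVDeadTuples
header, i.e. two int32 fields before the flexible array):

#include <stdio.h>

int main(void)
{
	const long max_alloc = 0x3fffffff;	/* MaxAllocSize = 1073741823 */
	const long itemptr = 6;			/* sizeof(ItemPointerData) */
	const long header = 8;			/* offsetof(LVDeadTuples, itemptrs) */

	/* the ceiling compute_max_dead_tuples() currently applies */
	long		maxtuples = max_alloc / itemptr;	/* 178956970 */

	/* what lazy_space_alloc() then actually requests */
	printf("request = %ld, limit = %ld\n",
		   header + itemptr * maxtuples, max_alloc);
	return 0;
}

This prints "request = 1073741828, limit = 1073741823", which matches the
invalid alloc request size in the error above: the request exceeds
MaxAllocSize by 5 bytes (the 8-byte header minus the 3 bytes lost to
integer division).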

Regards,

Andres

#386Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#385)

On Tue, Jan 21, 2020 at 11:30 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2020-01-20 09:09:35 +0530, Amit Kapila wrote:

Pushed, after fixing these two comments.

When attempting to vacuum a large table I just got:

postgres=# vacuum FREEZE ;
ERROR: invalid memory alloc request size 1073741828

#0 palloc (size=1073741828) at /mnt/tools/src/postgresql/src/backend/utils/mmgr/mcxt.c:959
#1 0x000056452cc45cac in lazy_space_alloc (vacrelstats=0x56452e5ab0e8, vacrelstats=0x56452e5ab0e8, relblocks=24686152)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:2741
#2 lazy_scan_heap (aggressive=true, nindexes=1, Irel=0x56452e5ab1c8, vacrelstats=<optimized out>, params=0x7ffdf8c00290, onerel=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:786
#3 heap_vacuum_rel (onerel=<optimized out>, params=0x7ffdf8c00290, bstrategy=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:472
#4 0x000056452cd8b42c in table_relation_vacuum (bstrategy=<optimized out>, params=0x7ffdf8c00290, rel=0x7fbcdff1e248)
at /mnt/tools/src/postgresql/src/include/access/tableam.h:1450
#5 vacuum_rel (relid=16454, relation=<optimized out>, params=params@entry=0x7ffdf8c00290) at /mnt/tools/src/postgresql/src/backend/commands/vacuum.c:1882

Looks to me that the calculation moved into compute_max_dead_tuples()
continues to use an allocation ceiling
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
but the actual allocation now is

#define SizeOfLVDeadTuples(cnt) \
add_size((offsetof(LVDeadTuples, itemptrs)), \
mul_size(sizeof(ItemPointerData), cnt))

i.e. the overhead of offsetof(LVDeadTuples, itemptrs) is not taken into
account.

Right, I think we need to take that into account in both places in
compute_max_dead_tuples():

maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
..
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#387Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#386)
1 attachment(s)

On Tue, 21 Jan 2020 at 15:35, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 11:30 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2020-01-20 09:09:35 +0530, Amit Kapila wrote:

Pushed, after fixing these two comments.

When attempting to vacuum a large table I just got:

postgres=# vacuum FREEZE ;
ERROR: invalid memory alloc request size 1073741828

#0 palloc (size=1073741828) at /mnt/tools/src/postgresql/src/backend/utils/mmgr/mcxt.c:959
#1 0x000056452cc45cac in lazy_space_alloc (vacrelstats=0x56452e5ab0e8, vacrelstats=0x56452e5ab0e8, relblocks=24686152)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:2741
#2 lazy_scan_heap (aggressive=true, nindexes=1, Irel=0x56452e5ab1c8, vacrelstats=<optimized out>, params=0x7ffdf8c00290, onerel=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:786
#3 heap_vacuum_rel (onerel=<optimized out>, params=0x7ffdf8c00290, bstrategy=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:472
#4 0x000056452cd8b42c in table_relation_vacuum (bstrategy=<optimized out>, params=0x7ffdf8c00290, rel=0x7fbcdff1e248)
at /mnt/tools/src/postgresql/src/include/access/tableam.h:1450
#5 vacuum_rel (relid=16454, relation=<optimized out>, params=params@entry=0x7ffdf8c00290) at /mnt/tools/src/postgresql/src/backend/commands/vacuum.c:1882

Looks to me that the calculation moved into compute_max_dead_tuples()
continues to use an allocation ceiling
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
but the actual allocation now is

#define SizeOfLVDeadTuples(cnt) \
add_size((offsetof(LVDeadTuples, itemptrs)), \
mul_size(sizeof(ItemPointerData), cnt))

i.e. the overhead of offsetof(LVDeadTuples, itemptrs) is not taken into
account.

Right, I think we need to take that into account in both places in
compute_max_dead_tuples():

maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
..
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));

Agreed. Attached patch should fix this issue.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix_max_dead_tuples.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b331f4c279..765d8992ea 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -159,9 +159,9 @@ typedef struct LVDeadTuples
 														 * ItemPointerData */
 } LVDeadTuples;
 
-#define SizeOfLVDeadTuples(cnt) \
-		add_size((offsetof(LVDeadTuples, itemptrs)), \
-				 mul_size(sizeof(ItemPointerData), cnt))
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))
+#define SizeOfDeadTuples(cnt) \
+	add_size(SizeOfLVDeadTuples, mul_size(sizeof(ItemPointerData), cnt))
 
 /*
  * Shared information among parallel workers.  So this is allocated in the DSM
@@ -2708,9 +2708,10 @@ compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 
 	if (useindex)
 	{
-		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+		maxtuples = ((vac_work_mem * 1024L) - SizeOfLVDeadTuplesHeader) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+		maxtuples = Min(maxtuples,
+						(MaxAllocSize - SizeOfLVDeadTuples) / sizeof(ItemPointerData));
 
 		/* curious coding here to ensure the multiplication can't overflow */
 		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
@@ -2738,7 +2739,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 
 	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
 
-	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfDeadTuples(maxtuples));
 	dead_tuples->num_tuples = 0;
 	dead_tuples->max_tuples = (int) maxtuples;
 
@@ -3146,7 +3147,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 
 	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
 	maxtuples = compute_max_dead_tuples(nblocks, true);
-	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	est_deadtuples = MAXALIGN(SizeOfDeadTuples(maxtuples));
 	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
#388Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#387)

On Tue, Jan 21, 2020 at 12:11 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 15:35, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 11:30 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2020-01-20 09:09:35 +0530, Amit Kapila wrote:

Pushed, after fixing these two comments.

When attempting to vacuum a large table I just got:

postgres=# vacuum FREEZE ;
ERROR: invalid memory alloc request size 1073741828

#0 palloc (size=1073741828) at /mnt/tools/src/postgresql/src/backend/utils/mmgr/mcxt.c:959
#1 0x000056452cc45cac in lazy_space_alloc (vacrelstats=0x56452e5ab0e8, vacrelstats=0x56452e5ab0e8, relblocks=24686152)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:2741
#2 lazy_scan_heap (aggressive=true, nindexes=1, Irel=0x56452e5ab1c8, vacrelstats=<optimized out>, params=0x7ffdf8c00290, onerel=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:786
#3 heap_vacuum_rel (onerel=<optimized out>, params=0x7ffdf8c00290, bstrategy=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:472
#4 0x000056452cd8b42c in table_relation_vacuum (bstrategy=<optimized out>, params=0x7ffdf8c00290, rel=0x7fbcdff1e248)
at /mnt/tools/src/postgresql/src/include/access/tableam.h:1450
#5 vacuum_rel (relid=16454, relation=<optimized out>, params=params@entry=0x7ffdf8c00290) at /mnt/tools/src/postgresql/src/backend/commands/vacuum.c:1882

Looks to me that the calculation moved into compute_max_dead_tuples()
continues to use an allocation ceiling
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
but the actual allocation now is

#define SizeOfLVDeadTuples(cnt) \
add_size((offsetof(LVDeadTuples, itemptrs)), \
mul_size(sizeof(ItemPointerData), cnt))

i.e. the overhead of offsetof(LVDeadTuples, itemptrs) is not taken into
account.

Right, I think we need to take that into account in both places in
compute_max_dead_tuples():

maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
..
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));

Agreed. Attached patch should fix this issue.

if (useindex)
  {
- maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+ maxtuples = ((vac_work_mem * 1024L) - SizeOfLVDeadTuplesHeader) /
sizeof(ItemPointerData);

SizeOfLVDeadTuplesHeader is not defined by patch. Do you think it
makes sense to add a comment here about the calculation?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#389Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#388)
1 attachment(s)

On Tue, 21 Jan 2020 at 16:13, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 12:11 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 15:35, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 11:30 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2020-01-20 09:09:35 +0530, Amit Kapila wrote:

Pushed, after fixing these two comments.

When attempting to vacuum a large table I just got:

postgres=# vacuum FREEZE ;
ERROR: invalid memory alloc request size 1073741828

#0 palloc (size=1073741828) at /mnt/tools/src/postgresql/src/backend/utils/mmgr/mcxt.c:959
#1 0x000056452cc45cac in lazy_space_alloc (vacrelstats=0x56452e5ab0e8, vacrelstats=0x56452e5ab0e8, relblocks=24686152)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:2741
#2 lazy_scan_heap (aggressive=true, nindexes=1, Irel=0x56452e5ab1c8, vacrelstats=<optimized out>, params=0x7ffdf8c00290, onerel=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:786
#3 heap_vacuum_rel (onerel=<optimized out>, params=0x7ffdf8c00290, bstrategy=<optimized out>)
at /mnt/tools/src/postgresql/src/backend/access/heap/vacuumlazy.c:472
#4 0x000056452cd8b42c in table_relation_vacuum (bstrategy=<optimized out>, params=0x7ffdf8c00290, rel=0x7fbcdff1e248)
at /mnt/tools/src/postgresql/src/include/access/tableam.h:1450
#5 vacuum_rel (relid=16454, relation=<optimized out>, params=params@entry=0x7ffdf8c00290) at /mnt/tools/src/postgresql/src/backend/commands/vacuum.c:1882

Looks to me that the calculation moved into compute_max_dead_tuples()
continues to use an allocation ceiling
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
but the actual allocation now is

#define SizeOfLVDeadTuples(cnt) \
add_size((offsetof(LVDeadTuples, itemptrs)), \
mul_size(sizeof(ItemPointerData), cnt))

i.e. the overhead of offsetof(LVDeadTuples, itemptrs) is not taken into
account.

Right, I think we need to take that into account in both places in
compute_max_dead_tuples():

maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
..
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));

Agreed. Attached patch should fix this issue.

if (useindex)
{
- maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+ maxtuples = ((vac_work_mem * 1024L) - SizeOfLVDeadTuplesHeader) /
sizeof(ItemPointerData);

SizeOfLVDeadTuplesHeader is not defined by the patch. Do you think it
makes sense to add a comment here about the calculation?

Oops, it should be SizeOfLVDeadTuples. Attached is an updated version.

I defined two macros: SizeOfLVDeadTuples is the size of the LVDeadTuples
struct, and SizeOfDeadTuples is the total size including the LVDeadTuples
struct and the dead tuple TIDs.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix_max_dead_tuples_v2.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b331f4c279..e776558008 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -159,9 +159,9 @@ typedef struct LVDeadTuples
 														 * ItemPointerData */
 } LVDeadTuples;
 
-#define SizeOfLVDeadTuples(cnt) \
-		add_size((offsetof(LVDeadTuples, itemptrs)), \
-				 mul_size(sizeof(ItemPointerData), cnt))
+#define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))
+#define SizeOfDeadTuples(cnt) \
+	add_size(SizeOfLVDeadTuples, mul_size(sizeof(ItemPointerData), cnt))
 
 /*
  * Shared information among parallel workers.  So this is allocated in the DSM
@@ -2708,9 +2708,11 @@ compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 
 	if (useindex)
 	{
-		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+		/* The dead tuple space consists of LVDeadTuples and dead tuple TIDs */
+		maxtuples = ((vac_work_mem * 1024L) - SizeOfLVDeadTuples) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+		maxtuples = Min(maxtuples,
+						(MaxAllocSize - SizeOfLVDeadTuples) / sizeof(ItemPointerData));
 
 		/* curious coding here to ensure the multiplication can't overflow */
 		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
@@ -2738,7 +2740,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 
 	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
 
-	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfDeadTuples(maxtuples));
 	dead_tuples->num_tuples = 0;
 	dead_tuples->max_tuples = (int) maxtuples;
 
@@ -3146,7 +3148,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 
 	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
 	maxtuples = compute_max_dead_tuples(nblocks, true);
-	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	est_deadtuples = MAXALIGN(SizeOfDeadTuples(maxtuples));
 	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
#390Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#389)
1 attachment(s)

On Tue, Jan 21, 2020 at 12:51 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 16:13, Amit Kapila <amit.kapila16@gmail.com> wrote:

SizeOfLVDeadTuplesHeader is not defined by the patch. Do you think it
makes sense to add a comment here about the calculation?

Oops, it should be SizeOfLVDeadTuples. Attached is an updated version.

I defined two macros: SizeOfLVDeadTuples is the size of the LVDeadTuples
struct, and SizeOfDeadTuples is the total size including the LVDeadTuples
struct and the dead tuple TIDs.

I have reproduced the issue by defining MaxAllocSize as 10240000 and
then, during debugging, skipping the check related to LAZY_ALLOC_TUPLES.
With the patch applied, the problem is fixed for me. I have slightly
modified your patch to define the macros along the lines of the existing
macros TXID_SNAPSHOT_SIZE and TXID_SNAPSHOT_MAX_NXIP. What do you think
about it?
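
(A sketch of that reproduction, assuming one temporarily edits the
definition in src/include/utils/memutils.h and skips the
LAZY_ALLOC_TUPLES clamp from the debugger:

#define MaxAllocSize	((Size) 10240000)	/* temporarily, instead of 0x3fffffff */

so that even a modest table drives compute_max_dead_tuples() into the
MaxAllocSize ceiling.)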

Andres, see if you get a chance to run the test again with the
attached patch, otherwise, I will commit it tomorrow morning.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_max_dead_tuples_v3.patch (application/octet-stream)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b331f4c279..26f055fc30 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -159,9 +159,12 @@ typedef struct LVDeadTuples
 														 * ItemPointerData */
 } LVDeadTuples;
 
-#define SizeOfLVDeadTuples(cnt) \
-		add_size((offsetof(LVDeadTuples, itemptrs)), \
-				 mul_size(sizeof(ItemPointerData), cnt))
+/* The dead tuple space consists of LVDeadTuples and dead tuple TIDs */
+#define SizeOfDeadTuples(cnt) \
+	add_size(offsetof(LVDeadTuples, itemptrs), \
+			 mul_size(sizeof(ItemPointerData), cnt))
+#define MAXDEADTUPLES(max_size) \
+		((max_size - offsetof(LVDeadTuples, itemptrs)) / sizeof(ItemPointerData))
 
 /*
  * Shared information among parallel workers.  So this is allocated in the DSM
@@ -2708,9 +2711,9 @@ compute_max_dead_tuples(BlockNumber relblocks, bool useindex)
 
 	if (useindex)
 	{
-		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
+		maxtuples = MAXDEADTUPLES(vac_work_mem * 1024L);
 		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+		maxtuples = Min(maxtuples, MAXDEADTUPLES(MaxAllocSize));
 
 		/* curious coding here to ensure the multiplication can't overflow */
 		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
@@ -2738,7 +2741,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 
 	maxtuples = compute_max_dead_tuples(relblocks, vacrelstats->useindex);
 
-	dead_tuples = (LVDeadTuples *) palloc(SizeOfLVDeadTuples(maxtuples));
+	dead_tuples = (LVDeadTuples *) palloc(SizeOfDeadTuples(maxtuples));
 	dead_tuples->num_tuples = 0;
 	dead_tuples->max_tuples = (int) maxtuples;
 
@@ -3146,7 +3149,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, LVRelStats *vacrelstats,
 
 	/* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
 	maxtuples = compute_max_dead_tuples(nblocks, true);
-	est_deadtuples = MAXALIGN(SizeOfLVDeadTuples(maxtuples));
+	est_deadtuples = MAXALIGN(SizeOfDeadTuples(maxtuples));
 	shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
#391Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#390)

On Tue, Jan 21, 2020 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 12:51 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 16:13, Amit Kapila <amit.kapila16@gmail.com> wrote:

SizeOfLVDeadTuplesHeader is not defined by the patch. Do you think it
makes sense to add a comment here about the calculation?

Oops, it should be SizeOfLVDeadTuples. Attached is an updated version.

I defined two macros: SizeOfLVDeadTuples is the size of the LVDeadTuples
struct, and SizeOfDeadTuples is the total size including the LVDeadTuples
struct and the dead tuple TIDs.

I have reproduced the issue by defining MaxAllocSize as 10240000 and
then, during debugging, skipping the check related to LAZY_ALLOC_TUPLES.
With the patch applied, the problem is fixed for me. I have slightly
modified your patch to define the macros along the lines of the existing
macros TXID_SNAPSHOT_SIZE and TXID_SNAPSHOT_MAX_NXIP. What do you think
about it?

Andres, see if you get a chance to run the test again with the
attached patch, otherwise, I will commit it tomorrow morning.

The patch looks fine to me, except that we had better put parentheses
around the variable passed to the macro.

+#define MAXDEADTUPLES(max_size) \
+ ((max_size - offsetof(LVDeadTuples, itemptrs)) / sizeof(ItemPointerData))

change to:

(((max_size) - offsetof(LVDeadTuples, itemptrs)) / sizeof(ItemPointerData))
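
To see why this matters, consider a hypothetical caller (limit_a and
limit_b are invented names, just for illustration) passing an expression
that contains an operator of lower precedence than '-':

MAXDEADTUPLES(limit_a < limit_b ? limit_a : limit_b)

Without the parentheses around max_size, the macro body expands to

((limit_a < limit_b ? limit_a : limit_b - offsetof(LVDeadTuples, itemptrs)) / sizeof(ItemPointerData))

which parses as (limit_a < limit_b) ? limit_a : (limit_b - offsetof(...)),
so whenever limit_a is the smaller value the header size is silently
never subtracted.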

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#392Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#390)

On Tue, 21 Jan 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 21, 2020 at 12:51 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 16:13, Amit Kapila <amit.kapila16@gmail.com> wrote:

SizeOfLVDeadTuplesHeader is not defined by the patch. Do you think it
makes sense to add a comment here about the calculation?

Oops, it should be SizeOfLVDeadTuples. Attached is an updated version.

I defined two macros: SizeOfLVDeadTuples is the size of the LVDeadTuples
struct, and SizeOfDeadTuples is the total size including the LVDeadTuples
struct and the dead tuple TIDs.

I have reproduced the issue by defining MaxAllocSize as 10240000 and
then, during debugging, skipping the check related to LAZY_ALLOC_TUPLES.
With the patch applied, the problem is fixed for me. I have slightly
modified your patch to define the macros along the lines of the existing
macros TXID_SNAPSHOT_SIZE and TXID_SNAPSHOT_MAX_NXIP. What do you think
about it?

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#393Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#392)

On Wed, Jan 22, 2020 at 7:14 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 21 Jan 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

I have reproduced the issue by defining MaxAllocSize as 10240000 and
then, during debugging, skipping the check related to LAZY_ALLOC_TUPLES.
With the patch applied, the problem is fixed for me. I have slightly
modified your patch to define the macros along the lines of the existing
macros TXID_SNAPSHOT_SIZE and TXID_SNAPSHOT_MAX_NXIP. What do you think
about it?

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Pushed after making the change suggested by Dilip.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#394Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#392)

On Wed, Jan 22, 2020 at 7:14 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Would you like to resubmit your vacuumdb utility patch for this
enhancement? I see an old version of it, and it seems to me that you
need to update that patch.

+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }

This will no longer be true.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#395Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#394)
1 attachment(s)

On Wed, 22 Jan 2020 at 11:23, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 22, 2020 at 7:14 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Would you like to resubmit your vacuumdb utility patch for this
enhancement? I see an old version of it, and it seems to me that you
need to update that patch.

+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }

This will no longer be true.

Attached is the updated version of the patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v36-0001-Add-paralell-P-option-to-vacuumdb-command.patch (application/octet-stream)
From 27a10160986ba92be8c11e23b98cd1aa40941bea Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 23 Jan 2019 16:07:53 +0900
Subject: [PATCH v36] Add --paralell, -P option to vacuumdb command

---
 doc/src/sgml/ref/vacuumdb.sgml    | 16 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 13 +++++++++++-
 src/bin/scripts/vacuumdb.c        | 34 ++++++++++++++++++++++++++++++-
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d93456f8..7b11fbb28f 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -226,6 +226,22 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-P <replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <listitem>
+       <para>
+        Specify the parallel degree of <firstterm>parallel vacuum</firstterm>.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 13 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35282..c2284c8195 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 49;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 2, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P 2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 0, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 0\).*;/,
+	'vacuumdb -P 0');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
@@ -81,6 +89,9 @@ $node->command_fails(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(c)', 'postgres' ],
 	'incorrect column name with ANALYZE');
+$node->command_fails(
+	[ 'vacuumdb', '-P', -1, 'postgres' ],
+	'negative parallel degree');
 $node->issues_sql_like(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(a, b)', 'postgres' ],
 	qr/statement: VACUUM \(ANALYZE\) public.vactable\(a, b\);/,
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index bfa6ac6355..2f7f5bde4a 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -35,6 +35,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* >= 0 indicates user specified the parallel
+									 * degree, otherwise -1 */
 } vacuumingOptions;
 
 
@@ -87,6 +89,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", required_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -116,6 +119,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -123,7 +127,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:P:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -183,6 +187,14 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				vacopts.parallel_workers = atoi(optarg);
+				if (vacopts.parallel_workers < 0)
+				{
+					pg_log_error("parallel vacuum degree must be a non-negative integer");
+					exit(1);
+				}
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -255,6 +267,12 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.parallel_workers >= 0)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -405,6 +423,13 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->parallel_workers >= 0 && PQserverVersion(conn) < 130000)
+	{
+		pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+					 "--parallel", "13");
+		exit(1);
+	}
+
 	if (!quiet)
 	{
 		if (stage != ANALYZE_NO_STAGE)
@@ -823,6 +848,12 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers >= 0)
+			{
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep,
+								  vacopts->parallel_workers);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -886,6 +917,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=PARALLEL_DEGREE  do parallel vacuum\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
2.23.0

#396Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Masahiko Sawada (#395)

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 22 Jan 2020 at 11:23, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 22, 2020 at 7:14 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Would you like to resubmit your vacuumdb utility patch for this
enhancement? I see an old version of it, and it seems to me that you
need to update that patch.

+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }

This will no longer be true.

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#397Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Mahendra Singh Thalor (#396)

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Wed, 22 Jan 2020 at 11:23, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 22, 2020 at 7:14 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Thank you for updating the patch. Yeah, MAXDEADTUPLES is better than
what I did in the previous version of the patch.

Would you like to resubmit your vacuumdb utility patch for this
enhancement? I see an old version of it, and it seems to me that you
need to update that patch.

+ if (optarg != NULL)
+ {
+ parallel_workers = atoi(optarg);
+ if (parallel_workers <= 0)
+ {
+ pg_log_error("number of parallel workers must be at least 1");
+ exit(1);
+ }
+ }

This will no longer be true.

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.
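
For reference, the invocations exercised here look roughly like the
following (testdb is a made-up database name; the comments show the
statement vacuumdb sends for each table):

vacuumdb -P 4 testdb        # VACUUM (PARALLEL 4) ...
vacuumdb -j 2 -P 4 testdb   # two concurrent connections, each issuing VACUUM (PARALLEL 4) ...
vacuumdb -P 0 testdb        # VACUUM (PARALLEL 0) ..., i.e. parallel vacuum disabled

Against a pre-13 server, the -P option makes vacuumdb exit with an error
before sending any statement.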

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#398Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#397)
1 attachment(s)

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v37-0001-Add-parallel-option-to-vacuumdb-command.patch (application/octet-stream)
From e1f52d3381bf274f2d7999abeb4b609e7f81fc95 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Sat, 25 Jan 2020 11:53:51 +0530
Subject: [PATCH] Add --parallel option to vacuumdb command.

Commit 40d964ec99 allowed vacuum command to leverage multiple CPUs by
invoking parallel workers to process indexes.  This commit provides a
--parallel option to specify the parallel degree used by vacuum command.

Author: Masahiko Sawada with few modifications by me
Reviewed-by: Mahendra Singh and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
---
 doc/src/sgml/ref/vacuumdb.sgml    | 18 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 13 ++++++++++-
 src/bin/scripts/vacuumdb.c        | 47 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..775c9ec 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,24 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <listitem>
+       <para>
+        Specify the parallel degree of <firstterm>parallel vacuum</firstterm>.
+        This allows the vacuum to leverage multiple CPUs to process indexes.
+        See <xref linkend="sql-vacuum"/>.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 13 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35..c2284c8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 49;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 2, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P 2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 0, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 0\).*;/,
+	'vacuumdb -P 0');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
@@ -81,6 +89,9 @@ $node->command_fails(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(c)', 'postgres' ],
 	'incorrect column name with ANALYZE');
+$node->command_fails(
+	[ 'vacuumdb', '-P', -1, 'postgres' ],
+	'negative parallel degree');
 $node->issues_sql_like(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(a, b)', 'postgres' ],
 	qr/statement: VACUUM \(ANALYZE\) public.vactable\(a, b\);/,
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index bfa6ac6..b54bb6f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -35,6 +35,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* >= 0 indicates user specified the parallel
+									 * degree, otherwise -1 */
 } vacuumingOptions;
 
 
@@ -87,6 +89,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", required_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -116,6 +119,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -123,7 +127,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -183,6 +187,14 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				vacopts.parallel_workers = atoi(optarg);
+				if (vacopts.parallel_workers < 0)
+				{
+					pg_log_error("parallel vacuum degree must be a non-negative integer");
+					exit(1);
+				}
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -258,6 +270,23 @@ main(int argc, char *argv[])
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	/* Prohibit full and analyze_only options with parallel option */
+	if (vacopts.parallel_workers >= 0)
+	{
+		if (vacopts.analyze_only)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
+		if (vacopts.full)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing full",
+						 "parallel");
+			exit(1);
+		}
+	}
+
 	setup_cancel_handler(NULL);
 
 	/* Avoid opening extra connections. */
@@ -405,6 +434,13 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->parallel_workers >= 0 && PQserverVersion(conn) < 130000)
+	{
+		pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+					 "--parallel", "13");
+		exit(1);
+	}
+
 	if (!quiet)
 	{
 		if (stage != ANALYZE_NO_STAGE)
@@ -823,6 +859,14 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers >= 0)
+			{
+				/* PARALLEL is supported since v13 */
+				Assert(serverVersion >= 130000);
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep,
+								  vacopts->parallel_workers);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -886,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=PARALLEL_DEGREE  do parallel vacuum\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

#399Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#398)

On Sat, 25 Jan 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

I did one more round of review. Below are some review comments:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we
are still doing parallel vacuum so we can use like "degree for parallel
vacuum"

2. Error message inconsistency for the FULL and parallel options:
Error for normal vacuum:
ERROR: cannot specify both FULL and PARALLEL options

Error for vacuumdb:
error: cannot use the "parallel" option when performing full

I think in both places we should use the second error message, as it
gives more clarity.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#400Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#399)

On Tue, Jan 28, 2020 at 2:13 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Sat, 25 Jan 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

I did one more round of review. Below are some review comments:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we are still doing parallel vacuum so we can use like "degree for parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

2. Error message inconsistency for the FULL and parallel options:
Error for normal vacuum:
ERROR: cannot specify both FULL and PARALLEL options

Error for vacuumdb:
error: cannot use the "parallel" option when performing full

I think in both places we should use the second error message, as it gives more clarity.

Which message are you advocating here: "cannot use the "parallel"
option when performing full" or "cannot specify both FULL and PARALLEL
options"? The message used in this patch is mainly because of
consistency with nearby messages in the vacuumdb utility. If you are
advocating changing "cannot specify both FULL and PARALLEL options"
to match what we are using in this patch, then it is better to do that
separately and maybe ask for more opinions. I think I understand your
desire to use the same message in both places, but it seems to me the
messages used in the two places are there to maintain consistency with
the nearby code or with messages used for a similar purpose.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#401Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#398)

On Sat, 25 Jan 2020 at 15:41, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

Your changes look good to me.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#402Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#400)

On Tue, 28 Jan 2020 at 08:14, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 28, 2020 at 2:13 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Sat, 25 Jan 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached the updated version patch.

Thanks Sawada-san for the re-based patch.

I reviewed and tested this patch. Patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

I did one more round of review. Below are some review comments:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying

-P, we are still doing parallel vacuum so we can use like "degree for
parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

If many background workers are available, then we use them automatically
(parallel vacuum is the default). This option puts a limit on the number
of background workers (a limit for vacuum workers) used by the vacuum
process. So I think we can use "max parallel vacuum workers (by default,
based on the number of indexes)" or "control parallel vacuum workers".

2. Error message inconsistency for the FULL and parallel options:
Error for normal vacuum:
ERROR: cannot specify both FULL and PARALLEL options

Error for vacuumdb:
error: cannot use the "parallel" option when performing full

I think in both places we should use the second error message, as it
gives more clarity.

Which message are you advocating here: "cannot use the "parallel"
option when performing full" or "cannot specify both FULL and PARALLEL
options"? The message used in this patch is mainly because of

I mean that "cannot use the "parallel" option when performing full"
should be used in both places.

consistency with nearby messages in the vacuumdb utility. If you are
advocating changing "cannot specify both FULL and PARALLEL options"
to match what we are using in this patch, then it is better to do that
separately and maybe ask for more opinions. I think I understand your
desire to use the same message in both places, but it seems to me the
messages used in the two places are there to maintain consistency with
the nearby code or with messages used for a similar purpose.

Okay, I agree with your points. Let's keep it as it is.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#403Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#402)

On Tue, Jan 28, 2020 at 12:04 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Tue, 28 Jan 2020 at 08:14, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 28, 2020 at 2:13 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Sat, 25 Jan 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

I did one more round of review. Below are some review comments:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we are still doing parallel vacuum so we can use like "degree for parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

If many background workers are available, then we use them automatically (parallel vacuum is the default). This option puts a limit on the number of background workers (a limit for vacuum workers) used by the vacuum process.

I don't think the option is just to specify the max limit, because
that is generally controlled by GUC parameters. This option allows
users to specify the number of workers for cases where they have more
knowledge about the size/type of the indexes. In some cases, the user
might be able to make a better decision, and that was the reason we
added this option in the first place.
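
As a sketch of that use case (orders is a made-up table name), a user
who knows that only two of the table's indexes are large enough to be
worth a worker might run

VACUUM (PARALLEL 2) orders;  -- cap this vacuum at two parallel workers

whereas a plain VACUUM lets the server choose the worker count from the
number of parallel-safe indexes, bounded by
max_parallel_maintenance_workers.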

So I think we can use "max parallel vacuum workers (by default, based on the number of indexes)" or "control parallel vacuum workers".

Hmm, I feel what I suggested is better because of the above explanation.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#404Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#403)

On Tue, 28 Jan 2020 at 12:32, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 28, 2020 at 12:04 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Tue, 28 Jan 2020 at 08:14, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 28, 2020 at 2:13 AM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Sat, 25 Jan 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jan 24, 2020 at 4:58 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

On Thu, 23 Jan 2020 at 15:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Wed, 22 Jan 2020 at 12:48, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Attached is the updated version of the patch.

Thanks Sawada-san for the rebased patch.

I reviewed and tested this patch. The patch looks good to me.

As suggested offline by Amit Kapila, I verified the vacuumdb "-P" option
functionality against older server versions (<13), and I also tested
vacuumdb with the "-j" option combined with "-P". All are working as
expected, and I didn't find any issue with these options.

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

I did one more round of review. Below are some review comments:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we are still doing parallel vacuum so we can use like "degree for parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

If many background workers are available, then we use them automatically (parallel vacuum is the default). This option puts a limit on the number of background workers (a limit for vacuum workers) used by the vacuum process.

I don't think the option is just to specify the max limit, because
that is generally controlled by GUC parameters. This option allows
users to specify the number of workers for cases where they have more
knowledge about the size/type of the indexes. In some cases, the user
might be able to make a better decision, and that was the reason we
added this option in the first place.

So I think we can use "max parallel vacuum workers (by default, based on the number of indexes)" or "control parallel vacuum workers".

Hmm, I feel what I suggested is better because of the above explanation.

Agreed.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#405Amit Kapila
amit.kapila16@gmail.com
In reply to: Mahendra Singh Thalor (#404)
1 attachment(s)

On Tue, Jan 28, 2020 at 12:53 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we are still doing parallel vacuum so we can use like "degree for parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

If many background workers are available, then we use them automatically (parallel vacuum is the default). This option puts a limit on the number of background workers (a limit for vacuum workers) used by the vacuum process.

I don't think the option is just to specify the max limit, because
that is generally controlled by GUC parameters. This option allows
users to specify the number of workers for cases where they have more
knowledge about the size/type of the indexes. In some cases, the user
might be able to make a better decision, and that was the reason we
added this option in the first place.

So I think we can use "max parallel vacuum workers (by default, based on the number of indexes)" or "control parallel vacuum workers".

Hmm, I feel what I suggested is better because of the above explanation.

Agreed.

Okay, thanks for the review. Attached is an updated patch. I have
additionally run pgindent. I am planning to commit the attached
tomorrow unless I see more comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v38-0001-Add-parallel-option-to-vacuumdb-command.patch (application/octet-stream)
From 2cac687e8c3f125aa3ca54c1264a2bd98fd2933a Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Sat, 25 Jan 2020 11:53:51 +0530
Subject: [PATCH] Add --parallel option to vacuumdb command.

Commit 40d964ec99 allowed vacuum command to leverage multiple CPUs by
invoking parallel workers to process indexes.  This commit provides a
--parallel option to specify the parallel degree used by vacuum command.

Author: Masahiko Sawada with few modifications by me
Reviewed-by: Mahendra Singh and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
---
 doc/src/sgml/ref/vacuumdb.sgml    | 18 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 13 ++++++++++-
 src/bin/scripts/vacuumdb.c        | 47 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..775c9ec 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -227,6 +227,24 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-P <replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <term><option>--parallel=<replaceable class="parameter">parallel_degree</replaceable></option></term>
+      <listitem>
+       <para>
+        Specify the parallel degree of <firstterm>parallel vacuum</firstterm>.
+        This allows the vacuum to leverage multiple CPUs to process indexes.
+        See <xref linkend="sql-vacuum"/>.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 13 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-q</option></term>
       <term><option>--quiet</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index b685b35..c2284c8 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 49;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -48,6 +48,14 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 2, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 2\).*;/,
+	'vacuumdb -P 2');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-P', 0, 'postgres' ],
+	qr/statement: VACUUM \(PARALLEL 0\).*;/,
+	'vacuumdb -P 0');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
@@ -81,6 +89,9 @@ $node->command_fails(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(c)', 'postgres' ],
 	'incorrect column name with ANALYZE');
+$node->command_fails(
+	[ 'vacuumdb', '-P', -1, 'postgres' ],
+	'negative parallel degree');
 $node->issues_sql_like(
 	[ 'vacuumdb', '--analyze', '--table', 'vactable(a, b)', 'postgres' ],
 	qr/statement: VACUUM \(ANALYZE\) public.vactable\(a, b\);/,
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index bfa6ac6..0560f63 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -35,6 +35,8 @@ typedef struct vacuumingOptions
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
+	int			parallel_workers;	/* >= 0 indicates user specified the
+									 * parallel degree, otherwise -1 */
 } vacuumingOptions;
 
 
@@ -87,6 +89,7 @@ main(int argc, char *argv[])
 		{"full", no_argument, NULL, 'f'},
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
+		{"parallel", required_argument, NULL, 'P'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -116,6 +119,7 @@ main(int argc, char *argv[])
 
 	/* initialize options to all false */
 	memset(&vacopts, 0, sizeof(vacopts));
+	vacopts.parallel_workers = -1;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -123,7 +127,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -183,6 +187,14 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 'P':
+				vacopts.parallel_workers = atoi(optarg);
+				if (vacopts.parallel_workers < 0)
+				{
+					pg_log_error("parallel vacuum degree must be a non-negative integer");
+					exit(1);
+				}
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -258,6 +270,23 @@ main(int argc, char *argv[])
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
+	/* Prohibit full and analyze_only options with parallel option */
+	if (vacopts.parallel_workers >= 0)
+	{
+		if (vacopts.analyze_only)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "parallel");
+			exit(1);
+		}
+		if (vacopts.full)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing full",
+						 "parallel");
+			exit(1);
+		}
+	}
+
 	setup_cancel_handler(NULL);
 
 	/* Avoid opening extra connections. */
@@ -405,6 +434,13 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->parallel_workers >= 0 && PQserverVersion(conn) < 130000)
+	{
+		pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+					 "--parallel", "13");
+		exit(1);
+	}
+
 	if (!quiet)
 	{
 		if (stage != ANALYZE_NO_STAGE)
@@ -823,6 +859,14 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sANALYZE", sep);
 				sep = comma;
 			}
+			if (vacopts->parallel_workers >= 0)
+			{
+				/* PARALLEL is supported since v13 */
+				Assert(serverVersion >= 130000);
+				appendPQExpBuffer(sql, "%sPARALLEL %d", sep,
+								  vacopts->parallel_workers);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -886,6 +930,7 @@ help(const char *progname)
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
+	printf(_("  -P, --parallel=PARALLEL_DEGREE  use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
 	printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table(s) only\n"));
-- 
1.8.3.1

#406Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#401)

On Tue, Jan 28, 2020 at 8:56 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Sat, 25 Jan 2020 at 15:41, Amit Kapila <amit.kapila16@gmail.com> wrote:

I have made a few modifications in the patch.

1. I think we should try to block the usage of the 'full' and 'parallel'
options in the utility rather than allowing the server to return an
error.
2. It is better to handle the 'P' option in getopt_long in the order of
its declaration in the long_options array.
3. Added an Assert on the server version while handling the parallel
option.
4. Added a few sentences in the documentation.

What do you guys think of the attached?

Your changes look good to me.

Thanks for the review.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#407Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Amit Kapila (#405)

On Tue, 28 Jan 2020 at 18:47, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jan 28, 2020 at 12:53 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:

1.
-P, --parallel=PARALLEL_DEGREE do parallel vacuum

I think, "do parallel vacuum" should be modified. Without specifying -P, we are still doing parallel vacuum so we can use like "degree for parallel vacuum"

I am not sure if 'degree' makes it very clear. How about "use this
many background workers for vacuum, if available"?

If many background workers are available, then we use them automatically (parallel vacuum is the default). This option puts a limit on the number of background workers (a limit for vacuum workers) used by the vacuum process.

I don't think the option is just to specify the max limit, because
that is generally controlled by GUC parameters. This option allows
users to specify the number of workers for cases where they have more
knowledge about the size/type of the indexes. In some cases, the user
might be able to make a better decision, and that was the reason we
added this option in the first place.

So I think we can use "max parallel vacuum workers (by default, based on the number of indexes)" or "control parallel vacuum workers".

Hmm, I feel what I suggested is better because of the above explanation.

Agreed.

Okay, thanks for the review. Attached is an updated patch. I have
additionally run pgindent. I am planning to commit the attached
tomorrow unless I see more comments.

Thank you for committing it!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#408Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#407)

On Wed, Jan 29, 2020 at 7:20 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Okay, thanks for the review. Attached is an updated patch. I have
additionally run pgindent. I am planning to commit the attached
tomorrow unless I see more comments.

Thank you for committing it!

I have marked this patch as committed in CF.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com