New vacuum option to do only freezing

Started by Masahiko Sawadaover 7 years ago118 messages
#1Masahiko Sawada
sawada.mshk@gmail.com
1 attachment(s)

Hi,

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

Currently this patch just adds the new option to VACUUM command but it
might be good to make autovacuum use it when emergency vacuum is
required.

This is a performance-test result for FREEZE option and FREEZE_ONLY
option. I've tested them on the table which is about 3.8GB table
without indexes and randomly modified.

* FREEZE
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": removed 5 row versions in 8 pages
INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s.
VACUUM
Time: 50301.262 ms (00:50.301)

* FREEZE_ONLY
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: freeze 30000000 rows
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s.
VACUUM
Time: 44589.794 ms (00:44.590)

Feedback is very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v1-0001-Add-FREEZE_ONLY-option-to-VACUUM-command.patchapplication/x-patch; name=v1-0001-Add-FREEZE_ONLY-option-to-VACUUM-command.patchDownload
From 14c870c263a58cb7b39151e618e1729395e8e354 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 28 Sep 2018 15:38:05 +0900
Subject: [PATCH v1] Add FREEZE_ONLY option to VACUUM command.

---
 doc/src/sgml/ref/vacuum.sgml         | 19 ++++++++
 src/backend/access/heap/pruneheap.c  | 49 +++++++++++++++++----
 src/backend/commands/vacuum.c        |  9 ++--
 src/backend/commands/vacuumlazy.c    | 84 +++++++++++++++++++++++++++---------
 src/backend/parser/gram.y            |  2 +
 src/include/access/heapam.h          |  3 +-
 src/include/nodes/parsenodes.h       |  3 +-
 src/test/regress/expected/vacuum.out |  1 +
 src/test/regress/sql/vacuum.sql      |  1 +
 9 files changed, 137 insertions(+), 34 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index b760e8e..c5560df 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -28,6 +28,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
 
     FULL
     FREEZE
+    FREEZE_ONLY
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
@@ -123,6 +124,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>FREEZE_ONLY</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples. If the table has any dead tuple it removes
+      them on both table and its indexes.
+      This option is same as <literal>FREEZE</literal> except for it disables
+      removing dead tuples, so it does only freezing tuples and pruning
+      HOT-updated tuple chains. This is intended to be used when necessary to
+      prevent transaction ID wraparound as quick as possible.
+      Aggressive freezing is always performed when the
+      table is rewritten, so this option is redundant when <literal>FULL</literal>
+      is specified.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index c2f5343..867d840 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -35,6 +35,7 @@ typedef struct
 	int			nredirected;	/* numbers of entries in arrays below */
 	int			ndead;
 	int			nunused;
+	bool		prune_root;		/* do we prune whole chain or root item? */
 	/* arrays that accumulate indexes of items to be changed */
 	OffsetNumber redirected[MaxHeapTuplesPerPage * 2];
 	OffsetNumber nowdead[MaxHeapTuplesPerPage];
@@ -53,6 +54,8 @@ static void heap_prune_record_redirect(PruneState *prstate,
 						   OffsetNumber offnum, OffsetNumber rdoffnum);
 static void heap_prune_record_dead(PruneState *prstate, OffsetNumber offnum);
 static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_advance_latest_removed_xid(PruneState *prstate, Page page,
+												  OffsetNumber offnum);
 
 
 /*
@@ -152,7 +155,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 															 * needed */
 
 			/* OK to prune */
-			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
+			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore,
+								   true);
 		}
 
 		/* And release buffer lock */
@@ -179,7 +183,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  */
 int
 heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid)
+				bool report_stats, TransactionId *latestRemovedXid,
+				bool prune_root)
 {
 	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
@@ -201,6 +206,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 	prstate.new_prune_xid = InvalidTransactionId;
 	prstate.latestRemovedXid = *latestRemovedXid;
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
+	prstate.prune_root = prune_root;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
 
 	/* Scan the page */
@@ -537,11 +543,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * find another DEAD tuple is a fairly unusual corner case.)
 		 */
 		if (tupdead)
-		{
 			latestdead = offnum;
-			HeapTupleHeaderAdvanceLatestRemovedXid(htup,
-												   &prstate->latestRemovedXid);
-		}
 		else if (!recent_dead)
 			break;
 
@@ -580,6 +582,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 */
 		for (i = 1; (i < nchain) && (chainitems[i - 1] != latestdead); i++)
 		{
+			heap_prune_advance_latest_removed_xid(prstate, dp, chainitems[i]);
 			heap_prune_record_unused(prstate, chainitems[i]);
 			ndeleted++;
 		}
@@ -598,7 +601,13 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * redirect the root to the correct chain member.
 		 */
 		if (i >= nchain)
-			heap_prune_record_dead(prstate, rootoffnum);
+		{
+			if (prstate->prune_root)
+			{
+				heap_prune_advance_latest_removed_xid(prstate, dp, rootoffnum);
+				heap_prune_record_dead(prstate, rootoffnum);
+			}
+		}
 		else
 			heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);
 	}
@@ -611,12 +620,36 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * redirect item.  We can clean up by setting the redirect item to
 		 * DEAD state.
 		 */
-		heap_prune_record_dead(prstate, rootoffnum);
+		if (prstate->prune_root)
+			heap_prune_record_dead(prstate, rootoffnum);
 	}
 
 	return ndeleted;
 }
 
+/*
+ * Advance latestRemovedXid using by the given page and offset. This
+ * function is used for the deleted tuples in the HOT chain. Since the
+ * root line pointer can be redirected we skip it.
+ */
+static void
+heap_prune_advance_latest_removed_xid(PruneState *prstate, Page page,
+									  OffsetNumber offnum)
+{
+	ItemId		lp;
+	HeapTupleHeader	htup;
+
+	lp = PageGetItemId(page, offnum);
+	Assert(ItemIdIsNormal(lp) || ItemIdIsRedirected(lp));
+
+	if (ItemIdIsRedirected(lp))
+		return;
+
+	htup = (HeapTupleHeader) PageGetItem(page, lp);
+	HeapTupleHeaderAdvanceLatestRemovedXid(htup,
+										   &prstate->latestRemovedXid);
+}
+
 /* Record lowest soon-prunable XID */
 static void
 heap_prune_record_prunable(PruneState *prstate, TransactionId xid)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f166509..de1c235 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -91,7 +91,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	/* sanity checks on options */
 	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE | VACOPT_FREEZE_ONLY)));
 	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
 
 	/*
@@ -113,10 +113,11 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	}
 
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * All freeze ages are zero if either the FREEZE option or the FREEZE_ONLY
+	 * option is given; otherwise pass them as -1 which means to use the default
+	 * values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options & (VACOPT_FREEZE | VACOPT_FREEZE_ONLY))
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8996d36..b1764ba 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -168,7 +168,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool collect_deadtuples);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -286,7 +287,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	/*
 	 * Optionally truncate the relation.
 	 */
-	if (should_attempt_truncation(vacrelstats))
+	if (should_attempt_truncation(vacrelstats) && (options & VACOPT_FREEZE_ONLY) == 0)
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
@@ -468,6 +469,11 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
+ *
+ *		If FREEE_ONLY option is specified, we do only freezing live tuples,
+ *		HOT pruning and maintaining both the freespace map and the visibility
+ *		map, but don't collect dead tuple pointers and invoke both vacuuming
+ *		of indexes and heap.
  */
 static void
 lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
@@ -495,6 +501,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	bool		freeze_only = (options & VACOPT_FREEZE_ONLY) != 0;
+	int			total_nfrozen = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -530,7 +538,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	lazy_space_alloc(vacrelstats, nblocks, !freeze_only);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -723,6 +731,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!freeze_only);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -946,7 +956,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
 		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-										 &vacrelstats->latestRemovedXid);
+										 &vacrelstats->latestRemovedXid,
+										 !freeze_only);
 
 		/*
 		 * Now scan the page to collect vacuumable items and check for tuples
@@ -995,7 +1006,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				if (!freeze_only)
+					lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1024,11 +1036,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				case HEAPTUPLE_DEAD:
 
 					/*
-					 * Ordinarily, DEAD tuples would have been removed by
-					 * heap_page_prune(), but it's possible that the tuple
-					 * state changed since heap_page_prune() looked.  In
-					 * particular an INSERT_IN_PROGRESS tuple could have
-					 * changed to DEAD if the inserter aborted.  So this
+					 * If reclaiming tuples is enabled, DEAD tuples would have
+					 * been removed by heap_page_prune(), but it's possible
+					 * that the tuple state changed since heap_page_prune()
+					 * looked. In particular an INSERT_IN_PROGRESS tuple could
+					 * have changed to DEAD if the inserter aborted.  So this
 					 * cannot be considered an error condition.
 					 *
 					 * If the tuple is HOT-updated then it must only be
@@ -1045,6 +1057,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * to detect that case and abort the transaction,
 					 * preventing corruption.
 					 */
+					if (freeze_only)
+						tupgone = true;
 					if (HeapTupleIsHotUpdated(&tuple) ||
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
@@ -1141,10 +1155,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
-				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
-													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
+				if (!freeze_only)
+				{
+					lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+					HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
+														   &vacrelstats->latestRemovedXid);
+					tups_vacuumed += 1;
+				}
 				has_dead_tuples = true;
 			}
 			else
@@ -1204,6 +1221,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			}
 
 			END_CRIT_SECTION();
+
+			/* Remember nfrozen in total */
+			total_nfrozen += nfrozen;
 		}
 
 		/*
@@ -1213,6 +1233,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		if (nindexes == 0 &&
 			vacrelstats->num_dead_tuples > 0)
 		{
+			Assert(!freeze_only);
+
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
 			has_dead_tuples = false;
@@ -1380,6 +1402,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!freeze_only);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1418,8 +1442,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!freeze_only)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1433,9 +1460,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * individual parts of the message when not applicable.
 	 */
 	initStringInfo(&buf);
-	appendStringInfo(&buf,
-					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
-					 nkeep, OldestXmin);
+	if (!freeze_only)
+		appendStringInfo(&buf,
+						 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
+						 nkeep, OldestXmin);
+	else
+		appendStringInfo(&buf, ngettext("freeze %d row\n",
+										"freeze %d rows\n",
+										total_nfrozen),
+						 total_nfrozen);
+
 	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
 					 nunused);
 	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
@@ -2085,13 +2119,23 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool collect_deadtuples)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
+	/* Don't prepare dead tuple space if we don't collect them */
+	if (!collect_deadtuples)
+	{
+		vacrelstats->num_dead_tuples = 0;
+		vacrelstats->max_dead_tuples = 0;
+		vacrelstats->dead_tuples = NULL;
+		return;
+	}
+
 	if (vacrelstats->hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d41eecd..348425e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10538,6 +10538,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "freeze_only") == 0)
+						$$ = VACOPT_FREEZE_ONLY;
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ca5cad7..cc017fe 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -187,7 +187,8 @@ extern void heap_update_snapshot(HeapScanDesc scan, Snapshot snapshot);
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int heap_page_prune(Relation relation, Buffer buffer,
 				TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid);
+				bool report_stats, TransactionId *latestRemovedXid,
+				bool prune_root);
 extern void heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 200df8e..98daedc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3157,7 +3157,8 @@ typedef enum VacuumOption
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock (autovacuum
 									 * only) */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FREEZE_ONLY = 1 << 8	/* do only freezing */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 5993a90..5fff354 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,7 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (FREEZE_ONLY) vaccluster, vactst;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 7f74da3..1965903 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,7 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (FREEZE_ONLY) vaccluster, vactst;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#2Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#1)
Re: New vacuum option to do only freezing

On Mon, Oct 1, 2018 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Hi,

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

Currently this patch just adds the new option to VACUUM command but it
might be good to make autovacuum use it when emergency vacuum is
required.

This is a performance-test result for FREEZE option and FREEZE_ONLY
option. I've tested them on the table which is about 3.8GB table
without indexes and randomly modified.

* FREEZE
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": removed 5 row versions in 8 pages
INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s.
VACUUM
Time: 50301.262 ms (00:50.301)

* FREEZE_ONLY
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: freeze 30000000 rows
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s.
VACUUM
Time: 44589.794 ms (00:44.590)

Feedback is very welcome.

Added to the next commit fest.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#2)
1 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Oct 4, 2018 at 6:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Oct 1, 2018 at 7:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Hi,

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

Currently this patch just adds the new option to VACUUM command but it
might be good to make autovacuum use it when emergency vacuum is
required.

This is a performance-test result for FREEZE option and FREEZE_ONLY
option. I've tested them on the table which is about 3.8GB table
without indexes and randomly modified.

* FREEZE
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": removed 5 row versions in 8 pages
INFO: "pgbench_accounts": found 5 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 722
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 4.20 s, system: 16.47 s, elapsed: 50.28 s.
VACUUM
Time: 50301.262 ms (00:50.301)

* FREEZE_ONLY
INFO: aggressively vacuuming "public.pgbench_accounts"
INFO: "pgbench_accounts": found 4 removable, 30000000 nonremovable
row versions in 491804 out of 491804 pages
DETAIL: freeze 30000000 rows
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 3.10 s, system: 14.85 s, elapsed: 44.56 s.
VACUUM
Time: 44589.794 ms (00:44.590)

Feedback is very welcome.

Added to the next commit fest.

Rebaed to the current HEAD.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v2-0001-Add-FREEZE_ONLY-option-to-VACUUM-command.patchapplication/octet-stream; name=v2-0001-Add-FREEZE_ONLY-option-to-VACUUM-command.patchDownload
From ddb820f9471f9c5f04aee98066f7f9cd8c3278d2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 28 Sep 2018 15:38:05 +0900
Subject: [PATCH v2] Add FREEZE_ONLY option to VACUUM command.

---
 doc/src/sgml/ref/vacuum.sgml         | 19 ++++++++
 src/backend/access/heap/pruneheap.c  | 49 +++++++++++++++++----
 src/backend/commands/vacuum.c        |  9 ++--
 src/backend/commands/vacuumlazy.c    | 84 +++++++++++++++++++++++++++---------
 src/backend/parser/gram.y            |  2 +
 src/include/access/heapam.h          |  3 +-
 src/include/nodes/parsenodes.h       |  3 +-
 src/test/regress/expected/vacuum.out |  1 +
 src/test/regress/sql/vacuum.sql      |  1 +
 9 files changed, 137 insertions(+), 34 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..f2ab450 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -28,6 +28,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
 
     FULL
     FREEZE
+    FREEZE_ONLY
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
@@ -124,6 +125,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>FREEZE_ONLY</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples. If the table has any dead tuple it removes
+      them on both table and its indexes.
+      This option is same as <literal>FREEZE</literal> except for it disables
+      removing dead tuples, so it does only freezing tuples and pruning
+      HOT-updated tuple chains. This is intended to be used when necessary to
+      prevent transaction ID wraparound as quick as possible.
+      Aggressive freezing is always performed when the
+      table is rewritten, so this option is redundant when <literal>FULL</literal>
+      is specified.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index c2f5343..867d840 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -35,6 +35,7 @@ typedef struct
 	int			nredirected;	/* numbers of entries in arrays below */
 	int			ndead;
 	int			nunused;
+	bool		prune_root;		/* do we prune whole chain or root item? */
 	/* arrays that accumulate indexes of items to be changed */
 	OffsetNumber redirected[MaxHeapTuplesPerPage * 2];
 	OffsetNumber nowdead[MaxHeapTuplesPerPage];
@@ -53,6 +54,8 @@ static void heap_prune_record_redirect(PruneState *prstate,
 						   OffsetNumber offnum, OffsetNumber rdoffnum);
 static void heap_prune_record_dead(PruneState *prstate, OffsetNumber offnum);
 static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_advance_latest_removed_xid(PruneState *prstate, Page page,
+												  OffsetNumber offnum);
 
 
 /*
@@ -152,7 +155,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 															 * needed */
 
 			/* OK to prune */
-			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
+			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore,
+								   true);
 		}
 
 		/* And release buffer lock */
@@ -179,7 +183,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  */
 int
 heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid)
+				bool report_stats, TransactionId *latestRemovedXid,
+				bool prune_root)
 {
 	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
@@ -201,6 +206,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 	prstate.new_prune_xid = InvalidTransactionId;
 	prstate.latestRemovedXid = *latestRemovedXid;
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
+	prstate.prune_root = prune_root;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
 
 	/* Scan the page */
@@ -537,11 +543,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * find another DEAD tuple is a fairly unusual corner case.)
 		 */
 		if (tupdead)
-		{
 			latestdead = offnum;
-			HeapTupleHeaderAdvanceLatestRemovedXid(htup,
-												   &prstate->latestRemovedXid);
-		}
 		else if (!recent_dead)
 			break;
 
@@ -580,6 +582,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 */
 		for (i = 1; (i < nchain) && (chainitems[i - 1] != latestdead); i++)
 		{
+			heap_prune_advance_latest_removed_xid(prstate, dp, chainitems[i]);
 			heap_prune_record_unused(prstate, chainitems[i]);
 			ndeleted++;
 		}
@@ -598,7 +601,13 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * redirect the root to the correct chain member.
 		 */
 		if (i >= nchain)
-			heap_prune_record_dead(prstate, rootoffnum);
+		{
+			if (prstate->prune_root)
+			{
+				heap_prune_advance_latest_removed_xid(prstate, dp, rootoffnum);
+				heap_prune_record_dead(prstate, rootoffnum);
+			}
+		}
 		else
 			heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);
 	}
@@ -611,12 +620,36 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		 * redirect item.  We can clean up by setting the redirect item to
 		 * DEAD state.
 		 */
-		heap_prune_record_dead(prstate, rootoffnum);
+		if (prstate->prune_root)
+			heap_prune_record_dead(prstate, rootoffnum);
 	}
 
 	return ndeleted;
 }
 
+/*
+ * Advance latestRemovedXid using by the given page and offset. This
+ * function is used for the deleted tuples in the HOT chain. Since the
+ * root line pointer can be redirected we skip it.
+ */
+static void
+heap_prune_advance_latest_removed_xid(PruneState *prstate, Page page,
+									  OffsetNumber offnum)
+{
+	ItemId		lp;
+	HeapTupleHeader	htup;
+
+	lp = PageGetItemId(page, offnum);
+	Assert(ItemIdIsNormal(lp) || ItemIdIsRedirected(lp));
+
+	if (ItemIdIsRedirected(lp))
+		return;
+
+	htup = (HeapTupleHeader) PageGetItem(page, lp);
+	HeapTupleHeaderAdvanceLatestRemovedXid(htup,
+										   &prstate->latestRemovedXid);
+}
+
 /* Record lowest soon-prunable XID */
 static void
 heap_prune_record_prunable(PruneState *prstate, TransactionId xid)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index a86963f..6f36fbc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -91,7 +91,7 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	/* sanity checks on options */
 	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE | VACOPT_FREEZE_ONLY)));
 	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
 
 	/*
@@ -113,10 +113,11 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	}
 
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * All freeze ages are zero if either the FREEZE option or the FREEZE_ONLY
+	 * option is given; otherwise pass them as -1 which means to use the default
+	 * values.
 	 */
-	if (vacstmt->options & VACOPT_FREEZE)
+	if (vacstmt->options & (VACOPT_FREEZE | VACOPT_FREEZE_ONLY))
 	{
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8996d36..b1764ba 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -168,7 +168,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool collect_deadtuples);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -286,7 +287,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	/*
 	 * Optionally truncate the relation.
 	 */
-	if (should_attempt_truncation(vacrelstats))
+	if (should_attempt_truncation(vacrelstats) && (options & VACOPT_FREEZE_ONLY) == 0)
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
@@ -468,6 +469,11 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		If there are no indexes then we can reclaim line pointers on the fly;
  *		dead line pointers need only be retained until all index pointers that
  *		reference them have been killed.
+ *
+ *		If FREEE_ONLY option is specified, we do only freezing live tuples,
+ *		HOT pruning and maintaining both the freespace map and the visibility
+ *		map, but don't collect dead tuple pointers and invoke both vacuuming
+ *		of indexes and heap.
  */
 static void
 lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
@@ -495,6 +501,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
+	bool		freeze_only = (options & VACOPT_FREEZE_ONLY) != 0;
+	int			total_nfrozen = 0;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -530,7 +538,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	lazy_space_alloc(vacrelstats, nblocks, !freeze_only);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -723,6 +731,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!freeze_only);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -946,7 +956,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
 		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-										 &vacrelstats->latestRemovedXid);
+										 &vacrelstats->latestRemovedXid,
+										 !freeze_only);
 
 		/*
 		 * Now scan the page to collect vacuumable items and check for tuples
@@ -995,7 +1006,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 */
 			if (ItemIdIsDead(itemid))
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+				if (!freeze_only)
+					lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
 				all_visible = false;
 				continue;
 			}
@@ -1024,11 +1036,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				case HEAPTUPLE_DEAD:
 
 					/*
-					 * Ordinarily, DEAD tuples would have been removed by
-					 * heap_page_prune(), but it's possible that the tuple
-					 * state changed since heap_page_prune() looked.  In
-					 * particular an INSERT_IN_PROGRESS tuple could have
-					 * changed to DEAD if the inserter aborted.  So this
+					 * If reclaiming tuples is enabled, DEAD tuples would have
+					 * been removed by heap_page_prune(), but it's possible
+					 * that the tuple state changed since heap_page_prune()
+					 * looked. In particular an INSERT_IN_PROGRESS tuple could
+					 * have changed to DEAD if the inserter aborted.  So this
 					 * cannot be considered an error condition.
 					 *
 					 * If the tuple is HOT-updated then it must only be
@@ -1045,6 +1057,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 					 * to detect that case and abort the transaction,
 					 * preventing corruption.
 					 */
+					if (freeze_only)
+						tupgone = true;
 					if (HeapTupleIsHotUpdated(&tuple) ||
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
@@ -1141,10 +1155,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 			if (tupgone)
 			{
-				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
-				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
-													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
+				if (!freeze_only)
+				{
+					lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
+					HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
+														   &vacrelstats->latestRemovedXid);
+					tups_vacuumed += 1;
+				}
 				has_dead_tuples = true;
 			}
 			else
@@ -1204,6 +1221,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			}
 
 			END_CRIT_SECTION();
+
+			/* Remember nfrozen in total */
+			total_nfrozen += nfrozen;
 		}
 
 		/*
@@ -1213,6 +1233,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		if (nindexes == 0 &&
 			vacrelstats->num_dead_tuples > 0)
 		{
+			Assert(!freeze_only);
+
 			/* Remove tuples from heap */
 			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
 			has_dead_tuples = false;
@@ -1380,6 +1402,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!freeze_only);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1418,8 +1442,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!freeze_only)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1433,9 +1460,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * individual parts of the message when not applicable.
 	 */
 	initStringInfo(&buf);
-	appendStringInfo(&buf,
-					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
-					 nkeep, OldestXmin);
+	if (!freeze_only)
+		appendStringInfo(&buf,
+						 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
+						 nkeep, OldestXmin);
+	else
+		appendStringInfo(&buf, ngettext("freeze %d row\n",
+										"freeze %d rows\n",
+										total_nfrozen),
+						 total_nfrozen);
+
 	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
 					 nunused);
 	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
@@ -2085,13 +2119,23 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool collect_deadtuples)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
+	/* Don't prepare dead tuple space if we don't collect them */
+	if (!collect_deadtuples)
+	{
+		vacrelstats->num_dead_tuples = 0;
+		vacrelstats->max_dead_tuples = 0;
+		vacrelstats->dead_tuples = NULL;
+		return;
+	}
+
 	if (vacrelstats->hasindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6d23bfb..9866b0b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10540,6 +10540,8 @@ vacuum_option_elem:
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
+					else if (strcmp($1, "freeze_only") == 0)
+						$$ = VACOPT_FREEZE_ONLY;
 					else
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 40e153f..423eeb1 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -188,7 +188,8 @@ extern void heap_update_snapshot(HeapScanDesc scan, Snapshot snapshot);
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int heap_page_prune(Relation relation, Buffer buffer,
 				TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid);
+				bool report_stats, TransactionId *latestRemovedXid,
+				bool prune_root);
 extern void heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aa4a0db..478c82c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3156,7 +3156,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_FREEZE_ONLY = 1 << 8	/* do only freezing */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..5553a69 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,7 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (FREEZE_ONLY) vaccluster, vactst;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..0e8805e 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,7 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (FREEZE_ONLY) vaccluster, vactst;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#4Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#3)
Re: New vacuum option to do only freezing

Hi,

On 10/1/18, 5:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

I've thought about this a bit myself. One of the reasons VACUUM can
take so long is because of all the index scans needed. If you're in a
potential XID wraparound situation and just need a quick way out, it
would be nice to have a way to do the minimum amount of work necessary
to reclaim transaction IDs. At a high level, I think there are some
improvements to this design we should consider.

1. Create a separate FREEZE command instead of adding a new VACUUM
option

The first line of the VACUUM documentation reads, "VACUUM reclaims
storage occupied by dead tuples," which is something that we would
explicitly not be doing with FREEZE_ONLY. I think it makes sense to
reuse many of the VACUUM code paths to implement this feature, but
from a user perspective, it should be separate.

2. We should reclaim transaction IDs from dead tuples as well

Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
feature will be useful for the imminent-XID-wraparound use-case. In
short, we won't be able to advance relfrozenxid and relminmxid beyond
the oldest XMAX value for the relation. IIUC the idea of freezing
XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and
can be reset to InvalidTransactionId, or it is committed and the tuple
can be removed if it beyond the freezing threshold. So, we probably
also want to look into adding a way to freeze XMAX, either by setting
it to FrozenTransactionId or by setting the hint bits to
(HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.

Looking closer, I see that the phrase "freezing XMAX" is currently
used to refer to setting it to InvalidTransactionId if it is aborted
or invalid (e.g. lock-only).

Currently this patch just adds the new option to VACUUM command but it
might be good to make autovacuum use it when emergency vacuum is
required.

This also seems like a valid use-case, but it should definitely be
done as a separate effort after this feature has been committed.

This is a performance-test result for FREEZE option and FREEZE_ONLY
option. I've tested them on the table which is about 3.8GB table
without indexes and randomly modified.

* FREEZE
...
Time: 50301.262 ms (00:50.301)

* FREEZE_ONLY
...
Time: 44589.794 ms (00:44.590)

I'd be curious to see the improvements you get when there are several
indexes on the relation. The ability to skip the index scans is
likely how this feature will really help speed things up.

Nathan

#5Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#1)
Re: New vacuum option to do only freezing

On Mon, Oct 1, 2018 at 6:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

The feature seems like a reasonable one, but the name doesn't seem
like a good choice. It doesn't only freeze - e.g. it HOT-prunes, as
it must. Maybe consider something like (without_index_cleanup true)
or (index_cleanup false).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#4)
Re: New vacuum option to do only freezing

On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan <bossartn@amazon.com> wrote:

Hi,

On 10/1/18, 5:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

Thank you for the comment!

I've thought about this a bit myself. One of the reasons VACUUM can
take so long is because of all the index scans needed. If you're in a
potential XID wraparound situation and just need a quick way out, it
would be nice to have a way to do the minimum amount of work necessary
to reclaim transaction IDs. At a high level, I think there are some
improvements to this design we should consider.

1. Create a separate FREEZE command instead of adding a new VACUUM
option

The first line of the VACUUM documentation reads, "VACUUM reclaims
storage occupied by dead tuples," which is something that we would
explicitly not be doing with FREEZE_ONLY.

No. Actually FREEZE_ONLY option (maybe will be changed its name) could
reclaim dead tuples by HOT-purning. If a page have HOT-updated chains
the FREEZE_ONLY prunes them and reclaim disk space occupied.

I think it makes sense to
reuse many of the VACUUM code paths to implement this feature, but
from a user perspective, it should be separate.

I'm concernced that since the existing users already have recognized
that vacuuming and freezing are closely related they would get
confused more if we have a similar purpose feature with different
name.

2. We should reclaim transaction IDs from dead tuples as well

Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
feature will be useful for the imminent-XID-wraparound use-case. In
short, we won't be able to advance relfrozenxid and relminmxid beyond
the oldest XMAX value for the relation.
IIUC the idea of freezing> XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and
can be reset to InvalidTransactionId, or it is committed and the tuple
can be removed if it beyond the freezing threshold. So, we probably
also want to look into adding a way to freeze XMAX, either by setting
it to FrozenTransactionId or by setting the hint bits to
(HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.

That's a good point. If the oldest xmax is close to the old
relfrozenxid we will not be able to advance relfrozenxid enough.
However, since dead tuples are vacuumed by autovacuum periodically I
think that we can advance relfrozenxid enough in common case. There is
possible that we eventually need to do vacuum with removing dead
tuples after done FREEZE_ONLY but it would be a rare case. Thought?

Looking closer, I see that the phrase "freezing XMAX" is currently
used to refer to setting it to InvalidTransactionId if it is aborted
or invalid (e.g. lock-only).

Currently this patch just adds the new option to VACUUM command but it
might be good to make autovacuum use it when emergency vacuum is
required.

This also seems like a valid use-case, but it should definitely be
done as a separate effort after this feature has been committed.

Agreed.

This is a performance-test result for FREEZE option and FREEZE_ONLY
option. I've tested them on the table which is about 3.8GB table
without indexes and randomly modified.

* FREEZE
...
Time: 50301.262 ms (00:50.301)

* FREEZE_ONLY
...
Time: 44589.794 ms (00:44.590)

I'd be curious to see the improvements you get when there are several
indexes on the relation. The ability to skip the index scans is
likely how this feature will really help speed things up.

I've tested performance of FREEZE option and FREEZE_ONLY option using
a 3GB table having 3 indexes. Before do vacuum I modified 1 % of data
on the table.

* FREEZE
Time: 78677.211 ms (01:18.677)
Time: 86958.452 ms (01:26.958)
Time: 78351.190 ms (01:18.351)

* FREEZE_ONLY
Time: 19913.863 ms (00:19.914)
Time: 18917.379 ms (00:18.917)
Time: 20048.541 ms (00:20.049)

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#7Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#5)
Re: New vacuum option to do only freezing

On Fri, Nov 2, 2018 at 2:02 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Oct 1, 2018 at 6:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
option is same as FREEZE option except for it disables reclaiming dead
tuples. That is, with this option vacuum does pruning HOT chain,
freezing live tuples and maintaining both visibility map and freespace
map but does not collect dead tuples and invoke neither heap vacuum
nor index vacuum. This option will be useful if user wants to prevent
XID wraparound a table as quick as possible, especially when table is
quite large and is about to XID wraparound. I think this usecase was
mentioned in some threads but I couldn't find them.

Thank you for the comment!

The feature seems like a reasonable one, but the name doesn't seem
like a good choice. It doesn't only freeze - e.g. it HOT-prunes, as
it must. Maybe consider something like (without_index_cleanup true)
or (index_cleanup false).

Adding a field-and-value style option might be worth. Or maybe we can
add one option for example freeze_without_index_cleanup?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#8Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#7)
Re: New vacuum option to do only freezing

On 11/5/18, 2:07 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan <bossartn@amazon.com> wrote:

1. Create a separate FREEZE command instead of adding a new VACUUM
option

The first line of the VACUUM documentation reads, "VACUUM reclaims
storage occupied by dead tuples," which is something that we would
explicitly not be doing with FREEZE_ONLY.

No. Actually FREEZE_ONLY option (maybe will be changed its name) could
reclaim dead tuples by HOT-purning. If a page have HOT-updated chains
the FREEZE_ONLY prunes them and reclaim disk space occupied.

I see.

I think it makes sense to
reuse many of the VACUUM code paths to implement this feature, but
from a user perspective, it should be separate.

I'm concernced that since the existing users already have recognized
that vacuuming and freezing are closely related they would get
confused more if we have a similar purpose feature with different
name.

That seems reasonable to me. Perhaps decoupling this option from
FREEZE would make it clearer to users and easier to name. This would
allow users to do both VACUUM (WITHOUT_INDEX_CLEANUP) and VACUUM
(FREEZE, WITHOUT_INDEX_CLEANUP).

2. We should reclaim transaction IDs from dead tuples as well

Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
feature will be useful for the imminent-XID-wraparound use-case. In
short, we won't be able to advance relfrozenxid and relminmxid beyond
the oldest XMAX value for the relation.
IIUC the idea of freezing> XMAX doesn't really exist yet. Either the XMAX is aborted/invalid and
can be reset to InvalidTransactionId, or it is committed and the tuple
can be removed if it beyond the freezing threshold. So, we probably
also want to look into adding a way to freeze XMAX, either by setting
it to FrozenTransactionId or by setting the hint bits to
(HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.

That's a good point. If the oldest xmax is close to the old
relfrozenxid we will not be able to advance relfrozenxid enough.
However, since dead tuples are vacuumed by autovacuum periodically I
think that we can advance relfrozenxid enough in common case. There is
possible that we eventually need to do vacuum with removing dead
tuples after done FREEZE_ONLY but it would be a rare case. Thought?

Given that a stated goal of this patch is to help recover from near
wraparound, I think this is very important optimization. It's true
that you might be able to advance relfrozenxid/relminmxid a bit in
some cases, but there are many others where you won't. For example,
if I create a row, delete it, and insert many more rows, my table's
XID age would be stuck at the row deletion. If I was in a situation
where this table was near wraparound and autovacuum wasn't keeping up,
I would run VACUUM (WITHOUT_INDEX_CLEANUP, FREEZE) with the intent of
reclaiming transaction IDs as fast as possible.

I'd be curious to see the improvements you get when there are several
indexes on the relation. The ability to skip the index scans is
likely how this feature will really help speed things up.

I've tested performance of FREEZE option and FREEZE_ONLY option using
a 3GB table having 3 indexes. Before do vacuum I modified 1 % of data
on the table.

* FREEZE
Time: 78677.211 ms (01:18.677)
Time: 86958.452 ms (01:26.958)
Time: 78351.190 ms (01:18.351)

* FREEZE_ONLY
Time: 19913.863 ms (00:19.914)
Time: 18917.379 ms (00:18.917)
Time: 20048.541 ms (00:20.049)

Nice.

Nathan

#9Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#7)
Re: New vacuum option to do only freezing

On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Adding a field-and-value style option might be worth. Or maybe we can
add one option for example freeze_without_index_cleanup?

That seems non-orthogonal. We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Greg Stark
stark@mit.edu
In reply to: Robert Haas (#9)
Re: New vacuum option to do only freezing

On Mon 5 Nov 2018, 12:49 Robert Haas <robertmhaas@gmail.com wrote:

That seems non-orthogonal. We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.

It sounds like it might be better to name this "VACUUM (FAST)” and document
that it skips some of the normal (and necessary) work that vacuum does and
is only suitable for avoiding wraparounds and not sufficient for avoiding
bloat

Show quoted text
#11Robert Haas
robertmhaas@gmail.com
In reply to: Greg Stark (#10)
Re: New vacuum option to do only freezing

On Mon, Nov 5, 2018 at 9:12 PM Greg Stark <stark@mit.edu> wrote:

It sounds like it might be better to name this "VACUUM (FAST)” and document that it skips some of the normal (and necessary) work that vacuum does and is only suitable for avoiding wraparounds and not sufficient for avoiding bloat

We could do that, but I don't see why that's better than VACUUM
(SKIP_INDEX_SCANS) or similar. There are, perhaps, multiple kinds of
shortcuts that could make vacuum run faster, but skipping index scans
is what it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#9)
Re: New vacuum option to do only freezing

On Tue, Nov 6, 2018 at 2:48 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Adding a field-and-value style option might be worth. Or maybe we can
add one option for example freeze_without_index_cleanup?

That seems non-orthogonal. We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.

We already have disable_page_skipping option, not (page_skipping
false). So ISMT disable_index_cleanup would be more natural. Also,
since what to do with this option is not only skipping vacuum indexes
but also skipping removeing dead tuples on heap, I think that the
option should have a more understandable name for users indicating
that both it removes dead tuples less than the normal vacuum and it's
aimed to freeze tuple more faster. Of course we document them, though.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Masahiko Sawada (#12)
Re: New vacuum option to do only freezing

On 2018-Nov-08, Masahiko Sawada wrote:

On Tue, Nov 6, 2018 at 2:48 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Adding a field-and-value style option might be worth. Or maybe we can
add one option for example freeze_without_index_cleanup?

That seems non-orthogonal. We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.

We already have disable_page_skipping option, not (page_skipping
false). So ISMT disable_index_cleanup would be more natural. Also,
since what to do with this option is not only skipping vacuum indexes
but also skipping removeing dead tuples on heap, I think that the
option should have a more understandable name for users indicating
that both it removes dead tuples less than the normal vacuum and it's
aimed to freeze tuple more faster. Of course we document them, though.

I would name this based on the fact that it freezes quickly -- something
like FAST_FREEZE perhaps. The other things seem implementation details.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#12)
Re: New vacuum option to do only freezing

On Thu, Nov 8, 2018 at 4:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

We already have disable_page_skipping option, not (page_skipping
false). So ISMT disable_index_cleanup would be more natural.

Sure.

Also,
since what to do with this option is not only skipping vacuum indexes
but also skipping removeing dead tuples on heap, I think that the
option should have a more understandable name for users indicating
that both it removes dead tuples less than the normal vacuum and it's
aimed to freeze tuple more faster. Of course we document them, though.

Well, I actually don't think that you should control two behaviors
with the same option. If you want to vacuum and skip index cleanup,
you should say VACUUM (disable_index_cleanup). If you want to vacuum,
disable index cleanup, and skip aggressively, you should say VACUUM
(freeze, disable_index_cleanup). Both behaviors seem useful.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#14)
1 attachment(s)
Re: New vacuum option to do only freezing

On Fri, Nov 9, 2018 at 2:56 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 8, 2018 at 4:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

We already have disable_page_skipping option, not (page_skipping
false). So ISMT disable_index_cleanup would be more natural.

Sure.

Also,
since what to do with this option is not only skipping vacuum indexes
but also skipping removeing dead tuples on heap, I think that the
option should have a more understandable name for users indicating
that both it removes dead tuples less than the normal vacuum and it's
aimed to freeze tuple more faster. Of course we document them, though.

Well, I actually don't think that you should control two behaviors
with the same option. If you want to vacuum and skip index cleanup,
you should say VACUUM (disable_index_cleanup). If you want to vacuum,
disable index cleanup, and skip aggressively, you should say VACUUM
(freeze, disable_index_cleanup). Both behaviors seem useful.

Attached the updated version patch incorporated all comments I got.
The patch includes the following changes.

* Changed the option name to DISABLE_INDEX_CLENAUP. I agreed with
Robert and Nathan, having an option separated from FREEZE.

* Instead of freezing xmax I've changed the behaviour of the new
option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead
instead of as unused and skips both index vacuum and index cleanup.
That is, we remove the storage of dead tuple but leave dead itemids
for the index lookups. These are removed by the next vacuum execution
enabling index cleanup. Currently HOT-pruning doesn't set the root of
the chain as unused even if the whole chain is dead. Since setting
tuples as dead removes storage the freezing xmax is no longer
required.

The second change might conflict with the retail index deletion
patch[1]/messages/by-id/425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru, which makes HOT-pruning *mark* tuples as dead (i.g., using
ItemIdMarkDead() instead). I think that this patch would not block
that patch but I've not considered deeply yet the combination with
these two pathes.

[1]: /messages/by-id/425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v3-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/octet-stream; name=v3-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From 0f0432f1be115411602dc97400937cff42434f19 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 9 Jan 2019 09:22:20 +0900
Subject: [PATCH v3] Add DISABLE_INDEX_CLEANUP option to VACUUM command.

---
 doc/src/sgml/ref/vacuum.sgml         |  19 +++++-
 src/backend/commands/vacuum.c        |   8 ++-
 src/backend/commands/vacuumlazy.c    | 128 ++++++++++++++++++++++++++++++-----
 src/backend/parser/gram.y            |   2 +
 src/include/nodes/parsenodes.h       |   4 +-
 src/test/regress/expected/vacuum.out |   2 +
 src/test/regress/sql/vacuum.sql      |   2 +
 7 files changed, 145 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..4ae0473 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,23 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option cannot be used in
+      conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ff1e178..5f0e7a4 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -204,7 +204,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -212,6 +213,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index b67267f..994d4cb 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -164,11 +164,14 @@ static void lazy_cleanup_index(Relation indrel,
 				   LVRelStats *vacrelstats);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+static void lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+								 LVRelStats *vacrelstats);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool disable_removing_tuples);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -284,9 +287,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		scanned_all_unfrozen = true;
 
 	/*
-	 * Optionally truncate the relation.
+	 * Optionally truncate the relation. If DISABLE_INDEX_CLEANUP is specified,
+	 * since we skipped to remove dead tuples there is no need to truncate heap.
 	 */
-	if (should_attempt_truncation(vacrelstats))
+	if (should_attempt_truncation(vacrelstats) &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) == 0)
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
@@ -493,6 +498,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		disable_index_cleanup = (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -530,7 +536,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	lazy_space_alloc(vacrelstats, nblocks, disable_index_cleanup);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -723,6 +729,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!disable_index_cleanup);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -1201,14 +1209,23 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If either there are no indexes or index cleanup is disabled then
+		 * we can vacuum the page right now instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || disable_index_cleanup) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			/*
+			 * If index cleanup is disabled, we set the recorded tuples as
+			 * dead, which removes its tuple storage but leaves ItemIDs.
+			 * By leaving dead ItemIDs an index lookup can properly see the
+			 * tuple is already dead. And the remained dead ItemIds will be
+			 * removed by the next vacuum enabled the index cleanup.
+			 */
+			if (nindexes > 0 && disable_index_cleanup)
+				lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats);
+			else
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
 			has_dead_tuples = false;
 
 			/*
@@ -1374,6 +1391,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!disable_index_cleanup);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1412,15 +1431,24 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!disable_index_cleanup)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
-		ereport(elevel,
-				(errmsg("\"%s\": removed %.0f row versions in %u pages",
-						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+	{
+		if (disable_index_cleanup)
+			ereport(elevel,
+					(errmsg("\"%s\": marked %.0f row versions in %u pages as dead",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+		else
+			ereport(elevel,
+					(errmsg("\"%s\": removed %.0f row versions in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+	}
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -1623,6 +1651,71 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	return tupindex;
 }
 
+
+/*
+ *	lazy_set_tuples_dead() -- set tuples as dead on a page
+ *							and repair its fragmentation.
+ *
+ * Unlike lazy_vacuum_page, this function sets the recorded tuples as dead
+ * instead of as unused, and doesn't update the page visibility.
+ * Also this function assumes that vacrelstats->dead_tuples has only dead
+ * tuples on the 'blkno' block.
+ *
+ * Caller must hold pin and buffer cleanup lock on the buffer.
+ */
+static void
+lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+					LVRelStats *vacrelstats)
+{
+	Page		page = BufferGetPage(buffer);
+	int			deadcnt = 0;
+	OffsetNumber dead[MaxOffsetNumber];
+	int			tupindex;
+
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+
+	START_CRIT_SECTION();
+
+	for (tupindex = 0; tupindex < vacrelstats->num_dead_tuples; tupindex++)
+	{
+		OffsetNumber toff;
+		ItemId		itemid;
+
+		Assert(blkno == ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]));
+
+		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
+		itemid = PageGetItemId(page, toff);
+
+		if (ItemIdIsDead(itemid))
+			continue;
+
+		ItemIdSetDead(itemid);
+		dead[deadcnt++] = toff;
+	}
+
+	PageRepairFragmentation(page);
+
+	/*
+	 * Mark buffer dirty before we write WAL.
+	 */
+	MarkBufferDirty(buffer);
+
+	/* XLOG stuff */
+	if (RelationNeedsWAL(onerel))
+	{
+		XLogRecPtr	recptr;
+
+		recptr = log_heap_clean(onerel, buffer,
+								NULL, 0,
+								dead, deadcnt,
+								NULL, 0,
+								vacrelstats->latestRemovedXid);
+		PageSetLSN(page, recptr);
+	}
+
+	END_CRIT_SECTION();
+}
+
 /*
  *	lazy_check_needs_freeze() -- scan page to see if any tuples
  *					 need to be cleaned to avoid wraparound
@@ -2079,14 +2172,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool disable_index_cleanup)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !disable_index_cleanup)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235..e688c04 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10511,6 +10511,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fe..8266eaf 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3152,7 +3152,9 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* don't remove dead tuple and
+											 * cleanup indexes */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..bf3e50e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,8 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster, vactst;
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster, vactst;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..a1beafc 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,8 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster, vactst;
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster, vactst;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#16Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#15)
Re: New vacuum option to do only freezing

Hi,

On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch incorporated all comments I got.

Thanks for the new patch!

* Instead of freezing xmax I've changed the behaviour of the new
option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead
instead of as unused and skips both index vacuum and index cleanup.
That is, we remove the storage of dead tuple but leave dead itemids
for the index lookups. These are removed by the next vacuum execution
enabling index cleanup. Currently HOT-pruning doesn't set the root of
the chain as unused even if the whole chain is dead. Since setting
tuples as dead removes storage the freezing xmax is no longer
required.

I tested this option with a variety of cases (HOT, indexes, etc.), and
it seems to work well. I haven't looked too deeply into the
implications of using LP_DEAD this way, but it seems like a reasonable
approach at first glance.

+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option cannot be used in
+      conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>

I think we should clarify the expected behavior when this option is
used on a table with no indexes. We probably do not want to fail, as
this could disrupt VACUUM commands that touch several tables. Also,
we don't need to set tuples as dead instead of unused, which appears
to be what this patch is actually doing:

+                       if (nindexes > 0 && disable_index_cleanup)
+                               lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats);
+                       else
+                               lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);

In this case, perhaps we should generate a log statement that notes
that DISABLE_INDEX_CLEANUP is being ignored and set
disable_index_cleanup to false during processing.

+               if (disable_index_cleanup)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions in %u pages as dead",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));
+               else
+                       ereport(elevel,
+                                       (errmsg("\"%s\": removed %.0f row versions in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

Should the first log statement only be generated when there are also
indexes?

+static void
+lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+                                       LVRelStats *vacrelstats)

This function looks very similar to lazy_vacuum_page(). Perhaps the
two could be combined?

Nathan

#17Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#16)
Re: New vacuum option to do only freezing

On Thu, Jan 10, 2019 at 5:23 AM Bossart, Nathan <bossartn@amazon.com> wrote:

Hi,

On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch incorporated all comments I got.

Thanks for the new patch!

* Instead of freezing xmax I've changed the behaviour of the new
option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead
instead of as unused and skips both index vacuum and index cleanup.
That is, we remove the storage of dead tuple but leave dead itemids
for the index lookups. These are removed by the next vacuum execution
enabling index cleanup. Currently HOT-pruning doesn't set the root of
the chain as unused even if the whole chain is dead. Since setting
tuples as dead removes storage the freezing xmax is no longer
required.

I tested this option with a variety of cases (HOT, indexes, etc.), and
it seems to work well. I haven't looked too deeply into the
implications of using LP_DEAD this way, but it seems like a reasonable
approach at first glance.

Thank you for reviewing the patch!

+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option cannot be used in
+      conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>

I think we should clarify the expected behavior when this option is
used on a table with no indexes. We probably do not want to fail, as
this could disrupt VACUUM commands that touch several tables. Also,
we don't need to set tuples as dead instead of unused, which appears
to be what this patch is actually doing:

+                       if (nindexes > 0 && disable_index_cleanup)
+                               lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats);
+                       else
+                               lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);

In this case, perhaps we should generate a log statement that notes
that DISABLE_INDEX_CLEANUP is being ignored and set
disable_index_cleanup to false during processing.

Agreed. How about output a NOTICE message before calling
lazy_scan_heap() in lazy_vacuum_rel()?

+               if (disable_index_cleanup)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions in %u pages as dead",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));
+               else
+                       ereport(elevel,
+                                       (errmsg("\"%s\": removed %.0f row versions in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

Should the first log statement only be generated when there are also
indexes?

You're right. Will fix.

+static void
+lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+                                       LVRelStats *vacrelstats)

This function looks very similar to lazy_vacuum_page(). Perhaps the
two could be combined?

Yes, I intentionally separed them as I was concerned the these
functions have different assumptions and usage. But the combining them
also could work. Will change it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#18Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#17)
1 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Jan 10, 2019 at 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jan 10, 2019 at 5:23 AM Bossart, Nathan <bossartn@amazon.com> wrote:

Hi,

On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch incorporated all comments I got.

Thanks for the new patch!

* Instead of freezing xmax I've changed the behaviour of the new
option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead
instead of as unused and skips both index vacuum and index cleanup.
That is, we remove the storage of dead tuple but leave dead itemids
for the index lookups. These are removed by the next vacuum execution
enabling index cleanup. Currently HOT-pruning doesn't set the root of
the chain as unused even if the whole chain is dead. Since setting
tuples as dead removes storage the freezing xmax is no longer
required.

I tested this option with a variety of cases (HOT, indexes, etc.), and
it seems to work well. I haven't looked too deeply into the
implications of using LP_DEAD this way, but it seems like a reasonable
approach at first glance.

Thank you for reviewing the patch!

+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option cannot be used in
+      conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>

I think we should clarify the expected behavior when this option is
used on a table with no indexes. We probably do not want to fail, as
this could disrupt VACUUM commands that touch several tables. Also,
we don't need to set tuples as dead instead of unused, which appears
to be what this patch is actually doing:

+                       if (nindexes > 0 && disable_index_cleanup)
+                               lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats);
+                       else
+                               lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);

In this case, perhaps we should generate a log statement that notes
that DISABLE_INDEX_CLEANUP is being ignored and set
disable_index_cleanup to false during processing.

Agreed. How about output a NOTICE message before calling
lazy_scan_heap() in lazy_vacuum_rel()?

+               if (disable_index_cleanup)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions in %u pages as dead",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));
+               else
+                       ereport(elevel,
+                                       (errmsg("\"%s\": removed %.0f row versions in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

Should the first log statement only be generated when there are also
indexes?

You're right. Will fix.

+static void
+lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+                                       LVRelStats *vacrelstats)

This function looks very similar to lazy_vacuum_page(). Perhaps the
two could be combined?

Yes, I intentionally separed them as I was concerned the these
functions have different assumptions and usage. But the combining them
also could work. Will change it.

Attached the updated patch. Please review it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v4-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/octet-stream; name=v4-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From 904ae364b7493bd17496a87b2eefa6dffe642d4b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 9 Jan 2019 09:22:20 +0900
Subject: [PATCH v4] Add DISABLE_INDEX_CLEANUP option to VACUUM command.

---
 doc/src/sgml/ref/vacuum.sgml         |  20 ++++-
 src/backend/commands/vacuum.c        |   8 +-
 src/backend/commands/vacuumlazy.c    | 138 +++++++++++++++++++++++++++--------
 src/backend/parser/gram.y            |   2 +
 src/include/nodes/parsenodes.h       |   4 +-
 src/test/regress/expected/vacuum.out |   4 +
 src/test/regress/sql/vacuum.sql      |   3 +
 7 files changed, 146 insertions(+), 33 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..6f6cc44 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option is ignored if the
+      table doesn't have index. Also, this cannot be used in conjunction
+      with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ff1e178..5f0e7a4 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -204,7 +204,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -212,6 +213,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index b67267f..656786f 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -113,7 +113,10 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */
 	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
@@ -163,12 +166,14 @@ static void lazy_cleanup_index(Relation indrel,
 				   IndexBulkDeleteResult *stats,
 				   LVRelStats *vacrelstats);
 static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+							int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer,
+							bool setdead);
 static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool skip_index_vacuum);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -261,6 +266,12 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
 	vacrelstats->hasindex = (nindexes > 0);
 
+	/* Notify user that DISABLE_INDEX_CLEANUP option is ignored */
+	if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP))
+		ereport(NOTICE,
+				(errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index",
+						RelationGetRelationName(onerel))));
+
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
 
@@ -284,9 +295,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		scanned_all_unfrozen = true;
 
 	/*
-	 * Optionally truncate the relation.
+	 * Optionally truncate the relation. If DISABLE_INDEX_CLEANUP is specified,
+	 * since we skipped to remove dead tuples there is no need to truncate heap.
 	 */
-	if (should_attempt_truncation(vacrelstats))
+	if (should_attempt_truncation(vacrelstats) &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) == 0)
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
@@ -493,6 +506,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		skip_index_vacuum;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -530,7 +544,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Skip index vacuum if it's requested for table with indexes. In this
+	 * case, we use the one-pass strategy and don't remove tuple storage.
+	 */
+	skip_index_vacuum =
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex;
+
+		lazy_space_alloc(vacrelstats, nblocks, skip_index_vacuum);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -723,6 +744,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!skip_index_vacuum);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -1201,14 +1224,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If either there are no indexes or skip index vacuums then we can
+		 * vacuum the page right now instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || skip_index_vacuum) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+			/*
+			 * If we skip index vacuum, we set the recorded tuples as
+			 * dead, which removes its tuple storage but leaves ItemIDs.
+			 * By leaving dead ItemIDs an index lookup can properly see the
+			 * tuple is already dead. The remained dead ItemIds will be
+			 * removed by the next vacuum enabled the index cleanup.
+			 */
+			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer,
+							 skip_index_vacuum);
 			has_dead_tuples = false;
 
 			/*
@@ -1374,6 +1404,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!skip_index_vacuum);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1412,15 +1444,24 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!skip_index_vacuum)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/* Make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
-		ereport(elevel,
-				(errmsg("\"%s\": removed %.0f row versions in %u pages",
-						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+	{
+		if (skip_index_vacuum)
+			ereport(elevel,
+					(errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+		else
+			ereport(elevel,
+					(errmsg("\"%s\": removed %.0f row versions in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+	}
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -1498,7 +1539,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 			continue;
 		}
 		tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
-									&vmbuffer);
+									&vmbuffer, false);
 
 		/* Now that we've compacted the page, record its available space */
 		page = BufferGetPage(buf);
@@ -1526,6 +1567,10 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  *	lazy_vacuum_page() -- free dead tuples on a page
  *					 and repair its fragmentation.
  *
+ * If 'setdead' is true (e.g., when index cleanup is disabled) then we
+ * set dead tuples as DEAD instead of as UNUSED and skip the visibility
+ * update.
+ *
  * Caller must hold pin and buffer cleanup lock on the buffer.
  *
  * tupindex is the index in vacrelstats->dead_tuples of the first dead
@@ -1534,11 +1579,12 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
  */
 static int
 lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
-				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer)
+				 int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer,
+				 bool setdead)
 {
 	Page		page = BufferGetPage(buffer);
-	OffsetNumber unused[MaxOffsetNumber];
-	int			uncnt = 0;
+	OffsetNumber items[MaxOffsetNumber];
+	int			cnt = 0;
 	TransactionId visibility_cutoff_xid;
 	bool		all_frozen;
 
@@ -1557,8 +1603,23 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 			break;				/* past end of tuples for this block */
 		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
 		itemid = PageGetItemId(page, toff);
-		ItemIdSetUnused(itemid);
-		unused[uncnt++] = toff;
+
+		if (setdead)
+		{
+			/*
+			 * When index cleanup is disabled, vacrelstats->dead_tuples could
+			 * have tuples that has already been set as DEAD by HOT-pruning.
+			 * So skip them to not WAL-logging again.
+			 */
+			if (ItemIdIsDead(itemid))
+				continue;
+
+			ItemIdSetDead(itemid);
+		}
+		else
+			ItemIdSetUnused(itemid);
+
+		items[cnt++] = toff;
 	}
 
 	PageRepairFragmentation(page);
@@ -1573,10 +1634,19 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	{
 		XLogRecPtr	recptr;
 
-		recptr = log_heap_clean(onerel, buffer,
-								NULL, 0, NULL, 0,
-								unused, uncnt,
-								vacrelstats->latestRemovedXid);
+		if (setdead)
+			recptr = log_heap_clean(onerel, buffer,
+									NULL, 0,	/* redirected */
+									items, cnt,	/* dead */
+									NULL, 0,	/* unused */
+									vacrelstats->latestRemovedXid);
+		else
+			recptr = log_heap_clean(onerel, buffer,
+									NULL, 0,	/* redirected */
+									NULL, 0,	/* dead */
+									items, cnt,	/* unused */
+									vacrelstats->latestRemovedXid);
+
 		PageSetLSN(page, recptr);
 	}
 
@@ -1589,6 +1659,13 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	END_CRIT_SECTION();
 
 	/*
+	 * Since we didn't actually remove dead tuples the visibility is not
+	 * changed.
+	 */
+	if (setdead)
+		return tupindex;
+
+	/*
 	 * Now that we have removed the dead tuples from the page, once again
 	 * check if the page has become all-visible.  The page is already marked
 	 * dirty, exclusively locked, and, if needed, a full page image has been
@@ -2079,14 +2156,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool skip_index_vacuum)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !skip_index_vacuum)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235..e688c04 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10511,6 +10511,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fe..8266eaf 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3152,7 +3152,9 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* don't remove dead tuple and
+											 * cleanup indexes */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..402a8be 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,10 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+NOTICE:  DISABLE_INDEX_CLEANUP is ignored because table "vactst" does not have index
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#19Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#18)
Re: New vacuum option to do only freezing

On Fri, Jan 11, 2019 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patch. Please review it.

I'm quite confused by this patch. It seems to me that the easiest way
to implement this patch would be to (1) make lazy_space_alloc take the
maxtuples = MaxHeapTuplesPerPage branch when the new option is
specified, and then (2) forget about them after each page i.e.

if (nindexes == 0 &&
vacrelstats->num_dead_tuples > 0)
{
...
}
else if (skipping index cleanup)
vacrelstats->num_dead_tuples = 0;

I don't see why it should touch the logic inside lazy_vacuum_page() or
the decision about whether to truncate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#19)
Re: New vacuum option to do only freezing

On Sat, Jan 12, 2019 at 3:27 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jan 11, 2019 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated patch. Please review it.

I'm quite confused by this patch. It seems to me that the easiest way
to implement this patch would be to (1) make lazy_space_alloc take the
maxtuples = MaxHeapTuplesPerPage branch when the new option is
specified, and then (2) forget about them after each page i.e.

if (nindexes == 0 &&
vacrelstats->num_dead_tuples > 0)
{
...
}
else if (skipping index cleanup)
vacrelstats->num_dead_tuples = 0;

I don't see why it should touch the logic inside lazy_vacuum_page() or
the decision about whether to truncate.

I think that because the tuples that got dead after heap_page_prune()
looked are recorded but not removed without lazy_vacuum_page() we need
to process them in lazy_vacuum_page(). For decision about whether to
truncate we should not change it, so I will fix it. It should be an
another option to control whether to truncate if we want.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#21Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#20)
Re: New vacuum option to do only freezing

On Mon, Jan 14, 2019 at 9:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that because the tuples that got dead after heap_page_prune()
looked are recorded but not removed without lazy_vacuum_page() we need
to process them in lazy_vacuum_page(). For decision about whether to
truncate we should not change it, so I will fix it. It should be an
another option to control whether to truncate if we want.

I think that heap_page_prune() is going to truncate dead tuples to
dead line pointers. At that point the tuple is gone. The only
remaining step is to mark the dead line pointer as unused, which can't
be done without scanning the indexes. So I don't understand what
lazy_vacuum_page() would need to do in this case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#21)
Re: New vacuum option to do only freezing

On Wed, Jan 16, 2019 at 5:24 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jan 14, 2019 at 9:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that because the tuples that got dead after heap_page_prune()
looked are recorded but not removed without lazy_vacuum_page() we need
to process them in lazy_vacuum_page(). For decision about whether to
truncate we should not change it, so I will fix it. It should be an
another option to control whether to truncate if we want.

I think that heap_page_prune() is going to truncate dead tuples to
dead line pointers. At that point the tuple is gone. The only
remaining step is to mark the dead line pointer as unused, which can't
be done without scanning the indexes. So I don't understand what
lazy_vacuum_page() would need to do in this case.

vacuumlazy.c:L1022
switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf))
{
case HEAPTUPLE_DEAD:

/*
* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.
*
(snip)
*/
if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple))
nkeep += 1;
else
tupgone = true; /* we can delete the tuple */
all_visible = false;
break;

As the above comment says, it's possible that the state of an
INSERT_IN_PROGRESS tuple could be changed to 'dead' after
heap_page_prune(). Since such tuple is not truncated at this point we
record it and set it as UNUSED in lazy_vacuum_page(). I think that the
DISABLE_INDEX_CLEANUP case is the same; we need to process them after
recorded. Am I missing something?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#23Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#22)
Re: New vacuum option to do only freezing

On Wed, Jan 16, 2019 at 3:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As the above comment says, it's possible that the state of an
INSERT_IN_PROGRESS tuple could be changed to 'dead' after
heap_page_prune(). Since such tuple is not truncated at this point we
record it and set it as UNUSED in lazy_vacuum_page(). I think that the
DISABLE_INDEX_CLEANUP case is the same; we need to process them after
recorded. Am I missing something?

I believe you are. Think about it this way. After the first pass
over the heap has been completed but before we've done anything to the
indexes, let alone started the second pass over the heap, somebody
could kill the vacuum process. Somebody could in fact yank the plug
out of the wall, stopping the entire server in its tracks. If they do
that, then lazy_vacuum_page() will never get executed. Yet, the heap
can't be in any kind of corrupted state at this point, right? We know
that the system is resilient against crashes, and killing a vacuum or
even the whole server midway through does not leave the system in any
kind of bad state. If it's fine for lazy_vacuum_page() to never be
reached in that case, it must also be fine for it never to be reached
if we ask for vacuum to stop cleanly before lazy_vacuum_page().

In the case of the particular comment to which you are referring, that
comment is part of lazy_scan_heap(), not lazy_vacuum_page(), so I
don't see how it bears on the question of whether we need to call
lazy_vacuum_page(). It's true that, at any point in time, an
in-progress transaction could abort. And if it does then some
insert-in-progress tuples could become dead. But if that happens,
then the next vacuum will remove them, just as it will remove any
tuples that become dead for that reason when vacuum isn't running in
the first place. You can't use that as a justification for needing a
second heap pass, because if it were, then you'd also need a THIRD
heap pass in case a transaction aborts after the second heap pass has
visited the pages, and a fourth heap pass in case a transaction aborts
after the third heap pass has visited the pages, etc. etc. forever.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#23)
Re: New vacuum option to do only freezing

On Thu, Jan 17, 2019 at 1:33 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jan 16, 2019 at 3:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As the above comment says, it's possible that the state of an
INSERT_IN_PROGRESS tuple could be changed to 'dead' after
heap_page_prune(). Since such tuple is not truncated at this point we
record it and set it as UNUSED in lazy_vacuum_page(). I think that the
DISABLE_INDEX_CLEANUP case is the same; we need to process them after
recorded. Am I missing something?

I believe you are. Think about it this way. After the first pass
over the heap has been completed but before we've done anything to the
indexes, let alone started the second pass over the heap, somebody
could kill the vacuum process. Somebody could in fact yank the plug
out of the wall, stopping the entire server in its tracks. If they do
that, then lazy_vacuum_page() will never get executed. Yet, the heap
can't be in any kind of corrupted state at this point, right? We know
that the system is resilient against crashes, and killing a vacuum or
even the whole server midway through does not leave the system in any
kind of bad state. If it's fine for lazy_vacuum_page() to never be
reached in that case, it must also be fine for it never to be reached
if we ask for vacuum to stop cleanly before lazy_vacuum_page().

Yes, that's right. It doesn't necessarily need to be reached lazy_vacuum_page().

In the case of the particular comment to which you are referring, that
comment is part of lazy_scan_heap(), not lazy_vacuum_page(), so I
don't see how it bears on the question of whether we need to call
lazy_vacuum_page(). It's true that, at any point in time, an
in-progress transaction could abort. And if it does then some
insert-in-progress tuples could become dead. But if that happens,
then the next vacuum will remove them, just as it will remove any
tuples that become dead for that reason when vacuum isn't running in
the first place. You can't use that as a justification for needing a
second heap pass, because if it were, then you'd also need a THIRD
heap pass in case a transaction aborts after the second heap pass has
visited the pages, and a fourth heap pass in case a transaction aborts
after the third heap pass has visited the pages, etc. etc. forever.

The reason why I processed the tuples that became dead after the first
heap pass is that I was not sure the reason why we ignore such tuples
in the second heap pass despite of there already have been the code
doing so which has been used for a long time. I thought we can do that
in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I
thought that lazy_vacuum_page() is the best place to set them as DEAD
I modified it (In the previous patch I introduced another function
setting them as DEAD aside from lazy_vacuum_page(). But since these
were almost same I merged them).

However, as you mentioned it's true that these tuples will be removed
by the next vacuum even if we ignore. Also these tuples would not be
many and without this change the patch would get more simple. So if
the risk of this change is greater than the benefit, we should not do
that.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#25Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#24)
Re: New vacuum option to do only freezing

On Thu, Jan 17, 2019 at 1:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The reason why I processed the tuples that became dead after the first
heap pass is that I was not sure the reason why we ignore such tuples
in the second heap pass despite of there already have been the code
doing so which has been used for a long time. I thought we can do that
in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I
thought that lazy_vacuum_page() is the best place to set them as DEAD
I modified it (In the previous patch I introduced another function
setting them as DEAD aside from lazy_vacuum_page(). But since these
were almost same I merged them).

The race you're concerned about is extremely narrow. We HOT-prune the
page, and then immediately afterward -- probably a few milliseconds
later -- we loop over the tuples still on the page and check the
status of each one. The only time we get a different answer is when a
transaction aborts in those few milliseconds. We don't worry about
handling those because it's a very rare condition.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#26Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#25)
1 attachment(s)
Re: New vacuum option to do only freezing

On Sat, Jan 19, 2019 at 5:08 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 17, 2019 at 1:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The reason why I processed the tuples that became dead after the first
heap pass is that I was not sure the reason why we ignore such tuples
in the second heap pass despite of there already have been the code
doing so which has been used for a long time. I thought we can do that
in the same manner even in DISABLE_INDEX_CLEANUP case. Also, since I
thought that lazy_vacuum_page() is the best place to set them as DEAD
I modified it (In the previous patch I introduced another function
setting them as DEAD aside from lazy_vacuum_page(). But since these
were almost same I merged them).

The race you're concerned about is extremely narrow. We HOT-prune the
page, and then immediately afterward -- probably a few milliseconds
later -- we loop over the tuples still on the page and check the
status of each one. The only time we get a different answer is when a
transaction aborts in those few milliseconds. We don't worry about
handling those because it's a very rare condition.

Understood and Agreed. I've attached the new version patch
incorporated the review comments.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v5-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/octet-stream; name=v5-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From a7f06a93eb447139a8537f50a92cd97792039dd1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 21 Jan 2019 19:07:44 +0900
Subject: [PATCH v5] Add DISABLE_INDEX_CLEANUP option to VACUUM command

---
 doc/src/sgml/ref/vacuum.sgml         | 20 +++++++++-
 src/backend/access/heap/vacuumlazy.c | 75 ++++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c        |  8 +++-
 src/backend/parser/gram.y            |  2 +
 src/include/nodes/parsenodes.h       |  4 +-
 src/test/regress/expected/vacuum.out |  4 ++
 src/test/regress/sql/vacuum.sql      |  3 ++
 7 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..11ee59a 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      Also, this cannot be used in conjunction with <literal>FULL</literal>
+      option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c09eb6e..e6ed278 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -113,7 +113,10 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */
 	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
@@ -168,7 +171,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool skip_index_vacuum);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -261,6 +265,12 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
 	vacrelstats->hasindex = (nindexes > 0);
 
+	/* Notify user that DISABLE_INDEX_CLEANUP option is ignored */
+	if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP))
+		ereport(NOTICE,
+				(errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index",
+						RelationGetRelationName(onerel))));
+
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
 
@@ -493,6 +503,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		skip_index_vacuum;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -530,7 +541,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Skip index vacuum if it's requested for table with indexes. In this
+	 * case, we use the one-pass strategy and don't remove tuple storage.
+	 */
+	skip_index_vacuum =
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex;
+
+	lazy_space_alloc(vacrelstats, nblocks, skip_index_vacuum);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -723,6 +741,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!skip_index_vacuum);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -1204,12 +1224,23 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || skip_index_vacuum) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			/*
+			 * Remove tuples from heap if the table has no index.  If the table
+			 * has index but index vacuum is disabled, we don't vacuum but forget
+			 * them. The vacrelstats->dead_tuples could have tuples which became
+			 * dead after checked at HOT-pruning time which are handled by
+			 * lazy_vacuum_page() but we don't worry about handling those because
+			 * it's a very rare condition and these would not be a large number.
+			 */
+			if (nindexes == 0)
+			{
+				Assert(!skip_index_vacuum);
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				has_dead_tuples = false;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1374,6 +1405,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!skip_index_vacuum);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1412,15 +1445,24 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!skip_index_vacuum)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/* Make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
-		ereport(elevel,
-				(errmsg("\"%s\": removed %.0f row versions in %u pages",
-						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+	{
+		if (skip_index_vacuum)
+			ereport(elevel,
+					(errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+		else
+			ereport(elevel,
+					(errmsg("\"%s\": removed %.0f row versions in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));
+	}
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -2079,14 +2121,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool skip_index_vacuum)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !skip_index_vacuum)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c4522cd..a6f6fa0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -204,7 +204,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -212,6 +213,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d8a3c2d..183a50d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10519,6 +10519,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index addc2c2..91c6fa6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3153,7 +3153,9 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* don't remove dead tuple and
+											 * cleanup indexes */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..402a8be 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,10 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+NOTICE:  DISABLE_INDEX_CLEANUP is ignored because table "vactst" does not have index
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#27Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#26)
Re: New vacuum option to do only freezing

On 1/21/19, 2:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Understood and Agreed. I've attached the new version patch
incorporated the review comments.

I took a look at the latest version of the patch.

+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      Also, this cannot be used in conjunction with <literal>FULL</literal>
+      option.
+     </para>
+    </listitem>
+   </varlistentry>

IMHO we could document this feature at a slightly higher level without
leaving out any really important user-facing behavior. Here's a quick
attempt to show what I am thinking:

With this option, VACUUM skips all index cleanup behavior and
only marks tuples as "dead" without reclaiming the storage.
While this can help reclaim transaction IDs faster to avoid
transaction ID wraparound (see Section 24.1.5), it will not
reduce bloat. Note that this option is ignored for tables
that have no indexes. Also, this option cannot be used in
conjunction with the FULL option, since FULL requires
rewriting the table.

+	/* Notify user that DISABLE_INDEX_CLEANUP option is ignored */
+	if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP))
+		ereport(NOTICE,
+				(errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index",
+						RelationGetRelationName(onerel))));

It might make more sense to emit this in lazy_scan_heap() where we
determine the value of skip_index_vacuum.

+		if (skip_index_vacuum)
+			ereport(elevel,
+					(errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+							RelationGetRelationName(onerel),
+							tups_vacuumed, vacuumed_pages)));

IIUC tups_vacuumed will include tuples removed during HOT pruning, so
it could be inaccurate to say all of these tuples have only been
marked "dead." Perhaps we could keep a separate count of tuples
removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP.
There might be similar problems with the values stored in vacrelstats
that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted).

I would suggest adding this option to vacuumdb in this patch set as
well.

Nathan

#28Michael Paquier
michael@paquier.xyz
In reply to: Bossart, Nathan (#27)
Re: New vacuum option to do only freezing

On Fri, Feb 01, 2019 at 10:43:37PM +0000, Bossart, Nathan wrote:

I would suggest adding this option to vacuumdb in this patch set as
well.

Doing it would be simple enough, for now I have moved it to next CF.
--
Michael

#29Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bossart, Nathan (#27)
Re: New vacuum option to do only freezing

On 2019-Feb-01, Bossart, Nathan wrote:

IMHO we could document this feature at a slightly higher level without
leaving out any really important user-facing behavior. Here's a quick
attempt to show what I am thinking:

With this option, VACUUM skips all index cleanup behavior and
only marks tuples as "dead" without reclaiming the storage.
While this can help reclaim transaction IDs faster to avoid
transaction ID wraparound (see Section 24.1.5), it will not
reduce bloat.

Hmm ... don't we compact out the storage for dead tuples? If we do (and
I think we should) then this wording is not entirely correct.

Note that this option is ignored for tables
that have no indexes. Also, this option cannot be used in
conjunction with the FULL option, since FULL requires
rewriting the table.

I would remove the "Also," in there, since it seems to me to give the
wrong impression about those two things being similar, but they're not:
one is convenient behavior, the other is a limitation.

+	/* Notify user that DISABLE_INDEX_CLEANUP option is ignored */
+	if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP))
+		ereport(NOTICE,
+				(errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index",
+						RelationGetRelationName(onerel))));

It might make more sense to emit this in lazy_scan_heap() where we
determine the value of skip_index_vacuum.

Why do we need a NOTICE here? I think this case should be silent.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Alvaro Herrera (#29)
2 attachment(s)
Re: New vacuum option to do only freezing

On Sat, Feb 2, 2019 at 7:00 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Feb-01, Bossart, Nathan wrote:

IMHO we could document this feature at a slightly higher level without
leaving out any really important user-facing behavior. Here's a quick
attempt to show what I am thinking:

With this option, VACUUM skips all index cleanup behavior and
only marks tuples as "dead" without reclaiming the storage.
While this can help reclaim transaction IDs faster to avoid
transaction ID wraparound (see Section 24.1.5), it will not
reduce bloat.

Hmm ... don't we compact out the storage for dead tuples? If we do (and
I think we should) then this wording is not entirely correct.

Yeah, we remove tuple and leave the dead ItemId. So we actually
reclaim the almost tuple storage.

Note that this option is ignored for tables
that have no indexes. Also, this option cannot be used in
conjunction with the FULL option, since FULL requires
rewriting the table.

I would remove the "Also," in there, since it seems to me to give the
wrong impression about those two things being similar, but they're not:
one is convenient behavior, the other is a limitation.

Agreed.

+     /* Notify user that DISABLE_INDEX_CLEANUP option is ignored */
+     if (!vacrelstats->hasindex && (options & VACOPT_DISABLE_INDEX_CLEANUP))
+             ereport(NOTICE,
+                             (errmsg("DISABLE_INDEX_CLEANUP is ignored because table \"%s\" does not have index",
+                                             RelationGetRelationName(onerel))));

It might make more sense to emit this in lazy_scan_heap() where we
determine the value of skip_index_vacuum.

Why do we need a NOTICE here? I think this case should be silent.

Okay, removed it.

On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote:

+               if (skip_index_vacuum)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

IIUC tups_vacuumed will include tuples removed during HOT pruning, so
it could be inaccurate to say all of these tuples have only been
marked "dead." Perhaps we could keep a separate count of tuples
removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP.
There might be similar problems with the values stored in vacrelstats
that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted).

Yeah, tups_vacuumed include such tuples so the message is inaccurate.
So I think that we should not change the message but we can report the
dead item pointers we left in errdetail. That's more accurate and
would help user.

postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test;
INFO: vacuuming "public.test"
INFO: "test": removed 49 row versions in 1 pages
INFO: "test": found 49 removable, 51 nonremovable row versions in 1
out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
49 tuples are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Attached the updated patch and the patch for vacuumdb.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v6-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/x-patch; name=v6-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From e309d7829572648157a5d7a8363e876aaef675ef Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 21 Jan 2019 19:07:44 +0900
Subject: [PATCH v6 1/2] Add DISABLE_INDEX_CLEANUP option to VACUUM command

---
 doc/src/sgml/ref/vacuum.sgml         | 21 ++++++++++-
 src/backend/access/heap/vacuumlazy.c | 68 ++++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c        |  8 ++++-
 src/backend/parser/gram.y            |  2 ++
 src/include/nodes/parsenodes.h       |  4 ++-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  3 ++
 7 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..f5cde2b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 37aa484..9585beb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,7 +112,10 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */
 	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
@@ -167,7 +170,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool skip_index_vacuum);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -485,13 +489,15 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft;			/* item pointers we left */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		skip_index_vacuum;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -517,7 +523,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = nleft = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -529,7 +535,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	/*
+	 * Skip index vacuum if it's requested for table with indexes. In this
+	 * case we use the one-pass strategy and don't remove tuple storage.
+	 */
+	skip_index_vacuum =
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex;
+
+	lazy_space_alloc(vacrelstats, nblocks, skip_index_vacuum);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -722,6 +735,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			Assert(!skip_index_vacuum);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -1203,12 +1218,28 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * If there are no indexes then we can vacuum the page right now
 		 * instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || skip_index_vacuum) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			/*
+			 * Remove tuples from heap if the table has no index.  If the table
+			 * has index but index vacuum is disabled, we don't vacuum and forget
+			 * them. The vacrelstats->dead_tuples could have tuples which became
+			 * dead after checked at HOT-pruning time which are handled by
+			 * lazy_vacuum_page() but we don't worry about handling those because
+			 * it's a very rare condition and these would not be a large number.
+			 */
+			if (nindexes == 0)
+			{
+				Assert(!skip_index_vacuum);
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/* Keep the number of tuples we left for the report */
+				nleft += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1373,6 +1404,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		Assert(!skip_index_vacuum);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1411,10 +1444,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!skip_index_vacuum)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/* Make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
@@ -1443,6 +1477,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	if (skip_index_vacuum)
+		appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+										"%.0f tuples are left as dead.\n",
+										nleft),
+						 nleft);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2078,14 +2117,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool skip_index_vacuum)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !skip_index_vacuum)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..00024dd 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -203,7 +203,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -211,6 +212,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf41..12a5057 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10465,6 +10465,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fe14d7..e79e67c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3156,7 +3156,9 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* don't remove dead tuple and
+											 * cleanup indexes */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..cffce59 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

v6-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchapplication/x-patch; name=v6-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchDownload
From 81690e88c1c53df19dcd7efdf79c3eef780c1d26 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v6 2/2] Add --diable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..d227225 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(DISABLE_INDEX_CLEANUP\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..7aaa3d0 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* DISABLE_PAGE_SKIPPING is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index clenaup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
1.8.3.1

#31Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#30)
Re: New vacuum option to do only freezing

On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

On Sat, Feb 2, 2019 at 7:00 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Feb-01, Bossart, Nathan wrote:

IMHO we could document this feature at a slightly higher level without
leaving out any really important user-facing behavior. Here's a quick
attempt to show what I am thinking:

With this option, VACUUM skips all index cleanup behavior and
only marks tuples as "dead" without reclaiming the storage.
While this can help reclaim transaction IDs faster to avoid
transaction ID wraparound (see Section 24.1.5), it will not
reduce bloat.

Hmm ... don't we compact out the storage for dead tuples? If we do (and
I think we should) then this wording is not entirely correct.

Yeah, we remove tuple and leave the dead ItemId. So we actually
reclaim the almost tuple storage.

Ah, yes. I was wrong here. Thank you for clarifying.

Attached the updated patch and the patch for vacuumdb.

Thanks! I am hoping to take a deeper look at this patch in the next
couple of days.

Nathan

#32Bossart, Nathan
bossartn@amazon.com
In reply to: Bossart, Nathan (#31)
Re: New vacuum option to do only freezing

Sorry for the delay. I finally got a chance to look through the
latest patches.

On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote:

+               if (skip_index_vacuum)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

IIUC tups_vacuumed will include tuples removed during HOT pruning, so
it could be inaccurate to say all of these tuples have only been
marked "dead." Perhaps we could keep a separate count of tuples
removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP.
There might be similar problems with the values stored in vacrelstats
that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted).

Yeah, tups_vacuumed include such tuples so the message is inaccurate.
So I think that we should not change the message but we can report the
dead item pointers we left in errdetail. That's more accurate and
would help user.

postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test;
INFO: vacuuming "public.test"
INFO: "test": removed 49 row versions in 1 pages
INFO: "test": found 49 removable, 51 nonremovable row versions in 1
out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
49 tuples are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

This seems reasonable to me.

The current version of the patches builds cleanly, passes 'make
check-world', and seems to work well in my own manual tests. I have a
number of small suggestions, but I think this will be ready-for-
committer soon.

+ Assert(!skip_index_vacuum);

There are two places in lazy_scan_heap() that I see this without any
comment. Can we add a short comment explaining why this should be
true at these points?

+       if (skip_index_vacuum)
+               appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+                                                                               "%.0f tuples are left as dead.\n",
+                                                                               nleft),
+                                                nleft);

I think we could emit this metric for all cases, not only when
DISABLE_INDEX_CLEANUP is used.

+                       /*
+                        * Remove tuples from heap if the table has no index.  If the table
+                        * has index but index vacuum is disabled, we don't vacuum and forget
+                        * them. The vacrelstats->dead_tuples could have tuples which became
+                        * dead after checked at HOT-pruning time which are handled by
+                        * lazy_vacuum_page() but we don't worry about handling those because
+                        * it's a very rare condition and these would not be a large number.
+                        */

Based on this, it sounds like nleft could be inaccurate. Do you think
it is worth adjusting the log message to reflect that, or is this
discrepancy something we should just live with? I think adding
something like "at least N tuples left marked dead" is arguably a bit
misleading, since the only time the number is actually higher is when
this "very rare condition" occurs.

+       /*
+        * Skip index vacuum if it's requested for table with indexes. In this
+        * case we use the one-pass strategy and don't remove tuple storage.
+        */
+       skip_index_vacuum =
+               (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex;

AFAICT we don't actually need to adjust this based on
vacrelstats->hasindex because we are already checking for indexes
everywhere we check for this option. What do you think about leaving
that part out?

+                       if (vacopts->disable_index_cleanup)
+                       {
+                               /* DISABLE_PAGE_SKIPPING is supported since 12 */
+                               Assert(serverVersion >= 120000);
+                               appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+                               sep = comma;
+                       }

s/DISABLE_PAGE_SKIPPING/DISABLE_INDEX_CLEANUP/

+ printf(_(" --disable-index-cleanup disable index vacuuming and index clenaup\n"));

s/clenaup/cleanup/

Nathan

#33Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#32)
Re: New vacuum option to do only freezing

On Wed, Feb 27, 2019 at 10:02 AM Bossart, Nathan <bossartn@amazon.com> wrote:

Sorry for the delay. I finally got a chance to look through the
latest patches.

On 2/3/19, 1:48 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

On Fri, Feb 1, 2019 at 11:43 PM Bossart, Nathan <bossartn@amazon.com> wrote:

+               if (skip_index_vacuum)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions as dead in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

IIUC tups_vacuumed will include tuples removed during HOT pruning, so
it could be inaccurate to say all of these tuples have only been
marked "dead." Perhaps we could keep a separate count of tuples
removed via HOT pruning in case we are using DISABLE_INDEX_CLEANUP.
There might be similar problems with the values stored in vacrelstats
that are used at the end of heap_vacuum_rel() (e.g. tuples_deleted).

Yeah, tups_vacuumed include such tuples so the message is inaccurate.
So I think that we should not change the message but we can report the
dead item pointers we left in errdetail. That's more accurate and
would help user.

postgres(1:1130)=# vacuum (verbose, disable_index_cleanup) test;
INFO: vacuuming "public.test"
INFO: "test": removed 49 row versions in 1 pages
INFO: "test": found 49 removable, 51 nonremovable row versions in 1
out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 509
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
49 tuples are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

This seems reasonable to me.

The current version of the patches builds cleanly, passes 'make
check-world', and seems to work well in my own manual tests. I have a
number of small suggestions, but I think this will be ready-for-
committer soon.

Thank you for reviewing the patch!

+ Assert(!skip_index_vacuum);

There are two places in lazy_scan_heap() that I see this without any
comment. Can we add a short comment explaining why this should be
true at these points?

Agreed. Will add a comment.

+       if (skip_index_vacuum)
+               appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+                                                                               "%.0f tuples are left as dead.\n",
+                                                                               nleft),
+                                                nleft);

I think we could emit this metric for all cases, not only when
DISABLE_INDEX_CLEANUP is used.

I think that tups_vacuumed shows total number of vacuumed tuples and
is already shown in the log message. The 'nleft' counts the total
number of recorded dead tuple but not counts tuples are removed during
HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP
case?

+                       /*
+                        * Remove tuples from heap if the table has no index.  If the table
+                        * has index but index vacuum is disabled, we don't vacuum and forget
+                        * them. The vacrelstats->dead_tuples could have tuples which became
+                        * dead after checked at HOT-pruning time which are handled by
+                        * lazy_vacuum_page() but we don't worry about handling those because
+                        * it's a very rare condition and these would not be a large number.
+                        */

Based on this, it sounds like nleft could be inaccurate. Do you think
it is worth adjusting the log message to reflect that, or is this
discrepancy something we should just live with? I think adding
something like "at least N tuples left marked dead" is arguably a bit
misleading, since the only time the number is actually higher is when
this "very rare condition" occurs.

Hmm, I think it's true that we leave 'nleft' dead tuples because it
includes both tuples marked as dead and tuples not marked yet. So I
think the log message works. Maybe the comment leads misreading. Will
fix it.

+       /*
+        * Skip index vacuum if it's requested for table with indexes. In this
+        * case we use the one-pass strategy and don't remove tuple storage.
+        */
+       skip_index_vacuum =
+               (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0 && vacrelstats->hasindex;

AFAICT we don't actually need to adjust this based on
vacrelstats->hasindex because we are already checking for indexes
everywhere we check for this option. What do you think about leaving
that part out?

Yeah, I think you're right. Will fix.

+                       if (vacopts->disable_index_cleanup)
+                       {
+                               /* DISABLE_PAGE_SKIPPING is supported since 12 */
+                               Assert(serverVersion >= 120000);
+                               appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+                               sep = comma;
+                       }

s/DISABLE_PAGE_SKIPPING/DISABLE_INDEX_CLEANUP/

Will fix.

+ printf(_(" --disable-index-cleanup disable index vacuuming and index clenaup\n"));

s/clenaup/cleanup/

Will fix.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#34Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#33)
Re: New vacuum option to do only freezing

On 2/27/19, 2:08 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

+       if (skip_index_vacuum)
+               appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+                                                                               "%.0f tuples are left as dead.\n",
+                                                                               nleft),
+                                                nleft);

I think we could emit this metric for all cases, not only when
DISABLE_INDEX_CLEANUP is used.

I think that tups_vacuumed shows total number of vacuumed tuples and
is already shown in the log message. The 'nleft' counts the total
number of recorded dead tuple but not counts tuples are removed during
HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP
case?

I think it is valuable. When DISABLE_INDEX_CLEANUP is not used or it
is used for a relation with no indexes, it makes it clear that no
tuples were left marked as dead. Also, it looks like all of the other
information here is provided regardless of the options used. IMO it
is good to list all of the stats so that users have the full picture
of what VACUUM did.

Nathan

#35Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#34)
2 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Feb 28, 2019 at 2:46 AM Bossart, Nathan <bossartn@amazon.com> wrote:

On 2/27/19, 2:08 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

+       if (skip_index_vacuum)
+               appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+                                                                               "%.0f tuples are left as dead.\n",
+                                                                               nleft),
+                                                nleft);

I think we could emit this metric for all cases, not only when
DISABLE_INDEX_CLEANUP is used.

I think that tups_vacuumed shows total number of vacuumed tuples and
is already shown in the log message. The 'nleft' counts the total
number of recorded dead tuple but not counts tuples are removed during
HOT-pruning. Is this a valuable for users in non-DISABLE_INDEX_CLEANUP
case?

I think it is valuable. When DISABLE_INDEX_CLEANUP is not used or it
is used for a relation with no indexes, it makes it clear that no
tuples were left marked as dead. Also, it looks like all of the other
information here is provided regardless of the options used. IMO it
is good to list all of the stats so that users have the full picture
of what VACUUM did.

I see your point. That seems good to me.

Attached the updated version patch. I've incorporated all review
comments I got and have changed the number of tuples being reported as
'removed tuples'. With this option, tuples completely being removed is
only tuples marked as unused during HOT-pruning, other dead tuples are
left. So we count those tuples during HOT-pruning and reports it as
removed tuples.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v7-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchapplication/octet-stream; name=v7-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchDownload
From 1cbc323d5f20cef9fd7a2feeb3731a29c6bb6fe7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v7 2/2] Add --diable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..d227225 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(DISABLE_INDEX_CLEANUP\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..4705217 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* DISABLE_INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
2.10.5

v7-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/octet-stream; name=v7-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From 5eb9e221c7cd025fa2d47e95ab7de78af0866c04 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 21 Jan 2019 19:07:44 +0900
Subject: [PATCH v7 1/2] Add DISABLE_INDEX_CLEANUP option to VACUUM command

With this option, VACUUM does HOT-pruning for live tuples but doesn't
remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/vacuum.sgml         | 21 +++++++++-
 src/backend/access/heap/pruneheap.c  | 16 ++++---
 src/backend/access/heap/vacuumlazy.c | 81 +++++++++++++++++++++++++++---------
 src/backend/commands/vacuum.c        |  8 +++-
 src/backend/parser/gram.y            |  2 +
 src/include/access/heapam.h          |  3 +-
 src/include/nodes/parsenodes.h       |  3 +-
 src/test/regress/expected/vacuum.out |  3 ++
 src/test/regress/sql/vacuum.sql      |  3 ++
 9 files changed, 111 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..f5cde2b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a3e5192..9db51e8 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -147,11 +147,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		 */
 		if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
 		{
-			TransactionId ignore = InvalidTransactionId;	/* return value not
-															 * needed */
+			/* return values not needed */
+			TransactionId ignore1 = InvalidTransactionId;
+			double ignore2 = 0;
 
 			/* OK to prune */
-			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
+			(void) heap_page_prune(relation, buffer, OldestXmin, true,
+								   &ignore1, &ignore2);
 		}
 
 		/* And release buffer lock */
@@ -173,12 +175,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * send its own new total to pgstats, and we don't want this delta applied
  * on top of that.)
  *
- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page and set latestRemoveXid
+ * and increment nunused.
  */
 int
 heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid)
+				bool report_stats, TransactionId *latestRemovedXid,
+				double *nunused)
 {
 	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
@@ -302,6 +305,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 		pgstat_update_heap_dead_tuples(relation, ndeleted - prstate.ndead);
 
 	*latestRemovedXid = prstate.latestRemovedXid;
+	*nunused += prstate.nunused;
 
 	/*
 	 * XXX Should we update the FSM information of this page ?
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..4a922c1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,7 +112,10 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */
 	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
@@ -167,7 +170,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool skip_index_vacuum);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -485,13 +489,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				tups_pruned,	/* tuples marked as unused by HOT-pruning */
+				nleft;			/* item pointers we left */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		skip_index_vacuum = (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -517,7 +524,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused
+		= nleft = tups_pruned = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -529,7 +537,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	lazy_space_alloc(vacrelstats, nblocks, skip_index_vacuum);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -722,6 +730,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			/* Index vacuum must be enabled in two-pass vacuum */
+			Assert(!skip_index_vacuum);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -951,7 +962,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
 		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-										 &vacrelstats->latestRemovedXid);
+										 &vacrelstats->latestRemovedXid,
+										 &tups_pruned);
 
 		/*
 		 * Now scan the page to collect vacuumable items and check for tuples
@@ -1206,15 +1218,31 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes or we skip index vacuum then we can vacuum
+		 * the page right now instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || skip_index_vacuum) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index vacuum is disabled. We don't
+				 * vacuum dead tuples on heap but forget them as we skip index
+				 * vacuum. The vacrelstats->dead_tuples could have tuples which
+				 * became dead after checked at HOT-pruning time but aren't marked
+				 * as dead yet. We don't process them because it's a very rare
+				 * condition and the next vacuum will process them.
+				 */
+				Assert(skip_index_vacuum);
+				nleft += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1379,6 +1407,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		/* Index vacuum must be enabled in two-pass vacuum */
+		Assert(!skip_index_vacuum);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1417,15 +1448,21 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!skip_index_vacuum)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/*
+	 * If no index or disables index vacuum, make log report that lazy_vacuum_heap
+	 * would've made. If index vacuum is disabled, we didn't remove all dead
+	 * tuples but did for tuples removed by HOT-pruning.
+	 */
 	if (vacuumed_pages)
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
 						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+						skip_index_vacuum ? tups_pruned : tups_vacuumed,
+						vacuumed_pages)));
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -1449,12 +1486,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, ngettext("%.0f tuple is left as dead.\n",
+									"%.0f tuples are left as dead.\n",
+									nleft),
+					 nleft);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
 			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
 					RelationGetRelationName(onerel),
-					tups_vacuumed, num_tuples,
+					skip_index_vacuum ? tups_pruned : tups_vacuumed,
+					num_tuples,
 					vacrelstats->scanned_pages, nblocks),
 			 errdetail_internal("%s", buf.data)));
 	pfree(buf.data);
@@ -2085,14 +2127,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool skip_index_vacuum)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !skip_index_vacuum)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..00024dd 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -203,7 +203,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -211,6 +212,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0279013..0b13ad7 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10471,6 +10471,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index ab08791..e0230d7 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -170,7 +170,8 @@ extern void heap_update_snapshot(HeapScanDesc scan, Snapshot snapshot);
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int heap_page_prune(Relation relation, Buffer buffer,
 				TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid);
+				bool report_stats, TransactionId *latestRemovedXid,
+				double *nunused);
 extern void heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a7e859d..f6d52ac 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3163,7 +3163,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* skip index vacuum and cleanup */
 } VacuumOption;
 
 /*
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..cffce59 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#36Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#35)
Re: New vacuum option to do only freezing

On Thu, Feb 28, 2019 at 3:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Regarding the user interface for this patch, please have a look at the
concerns I mention in

/messages/by-id/CA+TgmoZORX_UUv67rjaSX_aswkdZWV8kWfKfrWxyLdCqFqj+Yw@mail.gmail.com

I provided a suggestion there as to how to resolve the issue but
naturally others may feel differently.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#37Bossart, Nathan
bossartn@amazon.com
In reply to: Robert Haas (#36)
Re: New vacuum option to do only freezing

On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch. I've incorporated all review
comments I got and have changed the number of tuples being reported as
'removed tuples'. With this option, tuples completely being removed is
only tuples marked as unused during HOT-pruning, other dead tuples are
left. So we count those tuples during HOT-pruning and reports it as
removed tuples.

Thanks for the new patches. Beyond the reloptions discussion, most of
my feedback is wording suggestions.

+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.

There are a couple of small changes I would make. How does something
like this sound?

VACUUM removes dead tuples and prunes HOT-updated tuple chains for
live tuples on the table. If the table has any dead tuples, it
removes them from both the table and its indexes and marks the
corresponding line pointers as available for re-use. With this
option, VACUUM still removes dead tuples from the table, but it
does not process any indexes, and the line pointers are marked as
dead instead of available for re-use. This is suitable for
avoiding transaction ID wraparound (see Section 24.1.5) but not
sufficient for avoiding index bloat. This option is ignored if
the table does not have any indexes. This cannot be used in
conjunction with the FULL option.

- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page and set latestRemoveXid
+ * and increment nunused.

I would say something like: "Returns the number of tuples deleted from
the page, sets latestRemovedXid, and updates nunused."

+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */

I think the added sentence should make it more clear that hasindex
will still be true when DISABLE_INDEX_CLEANUP is used. Maybe
something like:

/*
* hasindex = true means two-pass strategy; false means one-pass
*
* If DISABLE_INDEX_CLEANUP is used, hasindex may still be true,
* but we'll always use the one-pass strategy.
*/

                tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                                                                &vacrelstats->latestRemovedXid);
+                                                                                &vacrelstats->latestRemovedXid,
+                                                                                &tups_pruned);

Why do we need a separate tups_pruned argument in heap_page_prune()?
Could we add the result of heap_page_prune() to tups_pruned instead,
then report the total number of removed tuples as tups_vacuumed +
tups_pruned elsewhere?

+		 * If there are no indexes or we skip index vacuum then we can vacuum
+		 * the page right now instead of doing a second scan.

How about:

If there are no indexes or index cleanup is disabled, we can
vacuum the page right now instead of doing a second scan.

+				/*
+				 * Here, we have indexes but index vacuum is disabled. We don't
+				 * vacuum dead tuples on heap but forget them as we skip index
+				 * vacuum. The vacrelstats->dead_tuples could have tuples which
+				 * became dead after checked at HOT-pruning time but aren't marked
+				 * as dead yet. We don't process them because it's a very rare
+				 * condition and the next vacuum will process them.
+				 */

I would suggest a few small changes:

/*
* Here, we have indexes but index vacuum is disabled. Instead of
* vacuuming the dead tuples on the heap, we just forget them.
*
* Note that vacrelstats->dead_tuples could include tuples which
* became dead after HOT-pruning but are not marked dead yet. We
* do not process them because this is a very rare condition, and
* the next vacuum will process them anyway.
*/

-       /* If no indexes, make log report that lazy_vacuum_heap would've made */
+       /*
+        * If no index or disables index vacuum, make log report that lazy_vacuum_heap
+        * would've made. If index vacuum is disabled, we didn't remove all dead
+        * tuples but did for tuples removed by HOT-pruning.
+        */
        if (vacuumed_pages)
                ereport(elevel,
                                (errmsg("\"%s\": removed %.0f row versions in %u pages",
                                                RelationGetRelationName(onerel),
-                                               tups_vacuumed, vacuumed_pages)));
+                                               skip_index_vacuum ? tups_pruned : tups_vacuumed,
+                                               vacuumed_pages)));

How about:

/*
* If no index or index vacuum is disabled, make log report that
* lazy_vacuum_heap would've made. If index vacuum is disabled,
* we don't include the tuples that we marked dead, but we do
* include tuples removed by HOT-pruning.
*/

Another interesting thing I noticed is that this "removed X row
versions" message is only emitted if vacuumed_pages is greater than 0.
However, if we only did HOT pruning, tups_vacuumed will be greater
than 0 while vacuumed_pages will still be 0, so some information will
be left out. I think this is already the case, though, so this could
probably be handled in a separate thread.

Nathan

#38Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#37)
Re: New vacuum option to do only freezing

On Tue, Mar 5, 2019 at 8:27 AM Bossart, Nathan <bossartn@amazon.com> wrote:

On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch. I've incorporated all review
comments I got and have changed the number of tuples being reported as
'removed tuples'. With this option, tuples completely being removed is
only tuples marked as unused during HOT-pruning, other dead tuples are
left. So we count those tuples during HOT-pruning and reports it as
removed tuples.

Thanks for the new patches. Beyond the reloptions discussion, most of
my feedback is wording suggestions.

Thank you for the comment!

+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.

There are a couple of small changes I would make. How does something
like this sound?

VACUUM removes dead tuples and prunes HOT-updated tuple chains for
live tuples on the table. If the table has any dead tuples, it
removes them from both the table and its indexes and marks the
corresponding line pointers as available for re-use. With this
option, VACUUM still removes dead tuples from the table, but it
does not process any indexes, and the line pointers are marked as
dead instead of available for re-use. This is suitable for
avoiding transaction ID wraparound (see Section 24.1.5) but not
sufficient for avoiding index bloat. This option is ignored if
the table does not have any indexes. This cannot be used in
conjunction with the FULL option.

Hmm, that's good idea but I'm not sure that user knows the word 'line
pointer' because it is used only at pageinspect document. I wonder if
the word 'item identifier' would rather be appropriate here because
this is used at at describing page layout(in storage.sgml). Thought?

- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page and set latestRemoveXid
+ * and increment nunused.

I would say something like: "Returns the number of tuples deleted from
the page, sets latestRemovedXid, and updates nunused."

Fixed.

+       /*
+        * hasindex = true means two-pass strategy; false means one-pass. But we
+        * always use the one-pass strategy when index vacuum is disabled.
+        */

I think the added sentence should make it more clear that hasindex
will still be true when DISABLE_INDEX_CLEANUP is used. Maybe
something like:

/*
* hasindex = true means two-pass strategy; false means one-pass
*
* If DISABLE_INDEX_CLEANUP is used, hasindex may still be true,
* but we'll always use the one-pass strategy.
*/

Fixed.

tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                                                                &vacrelstats->latestRemovedXid);
+                                                                                &vacrelstats->latestRemovedXid,
+                                                                                &tups_pruned);

Why do we need a separate tups_pruned argument in heap_page_prune()?
Could we add the result of heap_page_prune() to tups_pruned instead,
then report the total number of removed tuples as tups_vacuumed +
tups_pruned elsewhere?

Hmm, I thought that we should report only the number of tuples
completely removed but we already count the tulples marked as
redirected as tups_vacuumed. Let me summarize the fate of dead tuples.
I think we can roughly classify dead tuples as follows.

1. root tuple of HOT chain that became dead
2. root tuple of HOT chain that became redirected
3. other tupels of HOT chain that became unused
4. tuples that became dead after HOT pruning

The tuples of #1 through #3 either have only ItemIDs or have been
completely removed but tuples of #4 has its tuple storage because they
are not processed when HOT-pruning.

Currently tups_vacuumed counts all of them, nleft (=
vacrelstats->num_dead_tuples) counts #1 + #4. I think that the number
of removed tuples being reported would be #1 + #2 + #3. Or should we
use #2 + #3 instead?

+                * If there are no indexes or we skip index vacuum then we can vacuum
+                * the page right now instead of doing a second scan.

How about:

If there are no indexes or index cleanup is disabled, we can
vacuum the page right now instead of doing a second scan.

Fixed.

+                               /*
+                                * Here, we have indexes but index vacuum is disabled. We don't
+                                * vacuum dead tuples on heap but forget them as we skip index
+                                * vacuum. The vacrelstats->dead_tuples could have tuples which
+                                * became dead after checked at HOT-pruning time but aren't marked
+                                * as dead yet. We don't process them because it's a very rare
+                                * condition and the next vacuum will process them.
+                                */

I would suggest a few small changes:

/*
* Here, we have indexes but index vacuum is disabled. Instead of
* vacuuming the dead tuples on the heap, we just forget them.
*
* Note that vacrelstats->dead_tuples could include tuples which
* became dead after HOT-pruning but are not marked dead yet. We
* do not process them because this is a very rare condition, and
* the next vacuum will process them anyway.
*/

Fixed.

-       /* If no indexes, make log report that lazy_vacuum_heap would've made */
+       /*
+        * If no index or disables index vacuum, make log report that lazy_vacuum_heap
+        * would've made. If index vacuum is disabled, we didn't remove all dead
+        * tuples but did for tuples removed by HOT-pruning.
+        */
if (vacuumed_pages)
ereport(elevel,
(errmsg("\"%s\": removed %.0f row versions in %u pages",
RelationGetRelationName(onerel),
-                                               tups_vacuumed, vacuumed_pages)));
+                                               skip_index_vacuum ? tups_pruned : tups_vacuumed,
+                                               vacuumed_pages)));

How about:

/*
* If no index or index vacuum is disabled, make log report that
* lazy_vacuum_heap would've made. If index vacuum is disabled,
* we don't include the tuples that we marked dead, but we do
* include tuples removed by HOT-pruning.
*/

Fixed.

Another interesting thing I noticed is that this "removed X row
versions" message is only emitted if vacuumed_pages is greater than 0.
However, if we only did HOT pruning, tups_vacuumed will be greater
than 0 while vacuumed_pages will still be 0, so some information will
be left out. I think this is already the case, though, so this could
probably be handled in a separate thread.

Hmm, since this log message is corresponding to the one that
lazy_vacuum_heap makes and total number of removed tuples are always
reported, it seems consistent to me. Do you have another point?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#39Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Bossart, Nathan (#37)
Re: New vacuum option to do only freezing

Hello, I have some other comments.

At Mon, 4 Mar 2019 23:27:10 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in <48410154-E6C5-4C07-8122-8D04E3BCD1F8@amazon.com>

On 2/28/19, 12:13 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

Attached the updated version patch. I've incorporated all review
comments I got and have changed the number of tuples being reported as
'removed tuples'. With this option, tuples completely being removed is
only tuples marked as unused during HOT-pruning, other dead tuples are
left. So we count those tuples during HOT-pruning and reports it as
removed tuples.

Thanks for the new patches. Beyond the reloptions discussion, most of
my feedback is wording suggestions.

+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.

There are a couple of small changes I would make. How does something
like this sound?

VACUUM removes dead tuples and prunes HOT-updated tuple chains for
live tuples on the table. If the table has any dead tuples, it
removes them from both the table and its indexes and marks the
corresponding line pointers as available for re-use. With this
option, VACUUM still removes dead tuples from the table, but it
does not process any indexes, and the line pointers are marked as
dead instead of available for re-use. This is suitable for
avoiding transaction ID wraparound (see Section 24.1.5) but not
sufficient for avoiding index bloat. This option is ignored if
the table does not have any indexes. This cannot be used in
conjunction with the FULL option.

- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page and set latestRemoveXid
+ * and increment nunused.

I would say something like: "Returns the number of tuples deleted from
the page, sets latestRemovedXid, and updates nunused."

+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass. But we
+	 * always use the one-pass strategy when index vacuum is disabled.
+	 */

I think the added sentence should make it more clear that hasindex
will still be true when DISABLE_INDEX_CLEANUP is used. Maybe
something like:

/*
* hasindex = true means two-pass strategy; false means one-pass
*
* If DISABLE_INDEX_CLEANUP is used, hasindex may still be true,
* but we'll always use the one-pass strategy.
*/

tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                                                                &vacrelstats->latestRemovedXid);
+                                                                                &vacrelstats->latestRemovedXid,
+                                                                                &tups_pruned);

Why do we need a separate tups_pruned argument in heap_page_prune()?
Could we add the result of heap_page_prune() to tups_pruned instead,
then report the total number of removed tuples as tups_vacuumed +
tups_pruned elsewhere?

+		 * If there are no indexes or we skip index vacuum then we can vacuum
+		 * the page right now instead of doing a second scan.

How about:

If there are no indexes or index cleanup is disabled, we can
vacuum the page right now instead of doing a second scan.

+				/*
+				 * Here, we have indexes but index vacuum is disabled. We don't
+				 * vacuum dead tuples on heap but forget them as we skip index
+				 * vacuum. The vacrelstats->dead_tuples could have tuples which
+				 * became dead after checked at HOT-pruning time but aren't marked
+				 * as dead yet. We don't process them because it's a very rare
+				 * condition and the next vacuum will process them.
+				 */

I would suggest a few small changes:

/*
* Here, we have indexes but index vacuum is disabled. Instead of
* vacuuming the dead tuples on the heap, we just forget them.
*
* Note that vacrelstats->dead_tuples could include tuples which
* became dead after HOT-pruning but are not marked dead yet. We
* do not process them because this is a very rare condition, and
* the next vacuum will process them anyway.
*/

-       /* If no indexes, make log report that lazy_vacuum_heap would've made */
+       /*
+        * If no index or disables index vacuum, make log report that lazy_vacuum_heap
+        * would've made. If index vacuum is disabled, we didn't remove all dead
+        * tuples but did for tuples removed by HOT-pruning.
+        */
if (vacuumed_pages)
ereport(elevel,
(errmsg("\"%s\": removed %.0f row versions in %u pages",
RelationGetRelationName(onerel),
-                                               tups_vacuumed, vacuumed_pages)));
+                                               skip_index_vacuum ? tups_pruned : tups_vacuumed,
+                                               vacuumed_pages)));

How about:

/*
* If no index or index vacuum is disabled, make log report that
* lazy_vacuum_heap would've made. If index vacuum is disabled,
* we don't include the tuples that we marked dead, but we do
* include tuples removed by HOT-pruning.
*/

Another interesting thing I noticed is that this "removed X row
versions" message is only emitted if vacuumed_pages is greater than 0.
However, if we only did HOT pruning, tups_vacuumed will be greater
than 0 while vacuumed_pages will still be 0, so some information will
be left out. I think this is already the case, though, so this could
probably be handled in a separate thread.

+ nleft; /* item pointers we left */

The name seems to be something other, and the comment doesn't
makes sense at least.. for me.. Looking below,

+                                    "%.0f tuples are left as dead.\n",
+                                    nleft),
+                     nleft);

How about "nleft_dead; /* iterm pointers left as dead */"?

In this block:

-        if (nindexes == 0 &&
+        if ((nindexes == 0 || skip_index_vacuum) &&
             vacrelstats->num_dead_tuples > 0)
         {

Is it right that vacuumed_pages is incremented and FSM is updated
while the page is not actually vacuumed?

         tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                         &vacrelstats->latestRemovedXid);
+                                         &vacrelstats->latestRemovedXid,
+                                         &tups_pruned);

tups_pruned looks as "HOT pruned tuples". It is named "unused" in
the function's parameters. (But I think it is useless. Please see
the details below.)

I tested it with a simple data set.

(autovacuum = off)
drop table if exists t;
create table t with (fillfactor=50) as select a, a % 3 as b from generate_series(1, 9) a;
create index on t(a);
update t set a = -a where b = 0;
update t set b = b + 1 where b = 1;

We now have 9 tuples, 15 versions and 3 out of 6 "old" tuples are
to be "left dead" by DISABLE_INDEX_CLEANUP vacuum. It means,
three tuples ends with "left dead", three tuples are removed and
12 tuples will survive the vacuum below.

vacuum (verbose, freeze ,disable_index_cleanup, disable_page_skipping) t;

INFO: "t": removed 0 row versions in 1 pages
INFO: "t": found 0 removable, 9 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 925
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
3 tuples are left as dead.

Three tuple versions have vanished. Actually they were removed
but not shown in the message.

heap_prune_chain() doesn't count a live root entry of a chain as
"unused (line pointer)" since it is marked as "redierected". As
the result the vanished tuples are counted in tups_vacuumed, not
in tups_pruned. Maybe the name tups_vacuumed is confusing. After
removing tups_pruned code it works correctly.

INFO: "t": removed 6 row versions in 1 pages
INFO: "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 932
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
3 tuples are left as dead.

I see two choices of the second line above.

1> "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages

"removable" includes "left dead" tuples.

2> "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages

"removable" excludes "left dead" tuples.

If you prefer the latter, removable and nonremoveable need to be
corrected using nleft.

INFO: "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 942
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
3 tuples are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#40Bossart, Nathan
bossartn@amazon.com
In reply to: Kyotaro HORIGUCHI (#39)
Re: New vacuum option to do only freezing

On 3/5/19, 1:22 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

On Tue, Mar 5, 2019 at 8:27 AM Bossart, Nathan <bossartn@amazon.com> wrote:

+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> doesn't completely remove dead tuples
+      and disables removing dead tuples from indexes.  This is suitable for
+      avoiding transaction ID wraparound (see
+      <xref linkend="vacuum-for-wraparound"/>) but not sufficient for avoiding
+      index bloat. This option is ignored if the table doesn't have index.
+      This cannot be used in conjunction with <literal>FULL</literal>
+      option.

There are a couple of small changes I would make. How does something
like this sound?

VACUUM removes dead tuples and prunes HOT-updated tuple chains for
live tuples on the table. If the table has any dead tuples, it
removes them from both the table and its indexes and marks the
corresponding line pointers as available for re-use. With this
option, VACUUM still removes dead tuples from the table, but it
does not process any indexes, and the line pointers are marked as
dead instead of available for re-use. This is suitable for
avoiding transaction ID wraparound (see Section 24.1.5) but not
sufficient for avoiding index bloat. This option is ignored if
the table does not have any indexes. This cannot be used in
conjunction with the FULL option.

Hmm, that's good idea but I'm not sure that user knows the word 'line
pointer' because it is used only at pageinspect document. I wonder if
the word 'item identifier' would rather be appropriate here because
this is used at at describing page layout(in storage.sgml). Thought?

That seems reasonable to me. It seems like ItemIdData is referred to
as "item pointer," "line pointer," or "item identifier" depending on
where you're looking, but ItemPointerData is also referred to as "item
pointer." I think using "item identifier" is appropriate here for
clarity and consistency with storage.sgml.

tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                                                                &vacrelstats->latestRemovedXid);
+                                                                                &vacrelstats->latestRemovedXid,
+                                                                                &tups_pruned);

Why do we need a separate tups_pruned argument in heap_page_prune()?
Could we add the result of heap_page_prune() to tups_pruned instead,
then report the total number of removed tuples as tups_vacuumed +
tups_pruned elsewhere?

Hmm, I thought that we should report only the number of tuples
completely removed but we already count the tulples marked as
redirected as tups_vacuumed. Let me summarize the fate of dead tuples.
I think we can roughly classify dead tuples as follows.

1. root tuple of HOT chain that became dead
2. root tuple of HOT chain that became redirected
3. other tupels of HOT chain that became unused
4. tuples that became dead after HOT pruning

The tuples of #1 through #3 either have only ItemIDs or have been
completely removed but tuples of #4 has its tuple storage because they
are not processed when HOT-pruning.

Currently tups_vacuumed counts all of them, nleft (=
vacrelstats->num_dead_tuples) counts #1 + #4. I think that the number
of removed tuples being reported would be #1 + #2 + #3. Or should we
use #2 + #3 instead?

I think I'm actually more in favor of what was in v6. IIRC that
version of the patch didn't modify how we tracked the "removed" tuples
at all, but it just added the "X item identifiers left marked dead"
metric. Since even the tuples we are leaving marked dead lose
storage, that seems accurate enough to me.

Another interesting thing I noticed is that this "removed X row
versions" message is only emitted if vacuumed_pages is greater than 0.
However, if we only did HOT pruning, tups_vacuumed will be greater
than 0 while vacuumed_pages will still be 0, so some information will
be left out. I think this is already the case, though, so this could
probably be handled in a separate thread.

Hmm, since this log message is corresponding to the one that
lazy_vacuum_heap makes and total number of removed tuples are always
reported, it seems consistent to me. Do you have another point?

Here's an example:

postgres=# CREATE TABLE test (a INT, b INT);
CREATE TABLE
postgres=# CREATE INDEX ON test (a);
CREATE INDEX
postgres=# INSERT INTO test VALUES (1, 2);
INSERT 0 1

After only HOT updates, the "removed X row versions in Y pages"
message is not emitted:

postgres=# UPDATE test SET b = 3;
UPDATE 1
postgres=# UPDATE test SET b = 4;
UPDATE 1
postgres=# VACUUM (FREEZE, VERBOSE) test;
INFO: aggressively vacuuming "public.test"
INFO: index "test_a_idx" now contains 1 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 2 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 494
There were 1 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

After non-HOT updates, the "removed" message is emitted:

postgres=# UPDATE test SET a = 5;
UPDATE 1
postgres=# VACUUM (FREEZE, VERBOSE) test;
INFO: aggressively vacuuming "public.test"
INFO: scanned index "test_a_idx" to remove 1 row versions
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: "test": removed 1 row versions in 1 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: index "test_a_idx" now contains 1 row versions in 2 pages
DETAIL: 1 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 1 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 495
There were 1 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Nathan

#41Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#39)
2 attachment(s)
Re: New vacuum option to do only freezing

On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello, I have some other comments.

Thank you for the comment!

On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

+ nleft; /* item pointers we left */

The name seems to be something other, and the comment doesn't
makes sense at least.. for me.. Looking below,

+                                    "%.0f tuples are left as dead.\n",
+                                    nleft),
+                     nleft);

How about "nleft_dead; /* iterm pointers left as dead */"?

Fixed.

In this block:

-        if (nindexes == 0 &&
+        if ((nindexes == 0 || skip_index_vacuum) &&
vacrelstats->num_dead_tuples > 0)
{

Is it right that vacuumed_pages is incremented and FSM is updated
while the page is not actually vacuumed?

Good catch. I think the FSM stuff is right because we actually did HOT
pruning but the increment of vacuumed_page seems wrong to me. I think
we should not increment it and not report "removed XX row version in
YY pages" message.

tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-                                         &vacrelstats->latestRemovedXid);
+                                         &vacrelstats->latestRemovedXid,
+                                         &tups_pruned);

tups_pruned looks as "HOT pruned tuples". It is named "unused" in
the function's parameters. (But I think it is useless. Please see
the details below.)

I tested it with a simple data set.

(autovacuum = off)
drop table if exists t;
create table t with (fillfactor=50) as select a, a % 3 as b from generate_series(1, 9) a;
create index on t(a);
update t set a = -a where b = 0;
update t set b = b + 1 where b = 1;

We now have 9 tuples, 15 versions and 3 out of 6 "old" tuples are
to be "left dead" by DISABLE_INDEX_CLEANUP vacuum. It means,
three tuples ends with "left dead", three tuples are removed and
12 tuples will survive the vacuum below.

vacuum (verbose, freeze ,disable_index_cleanup, disable_page_skipping) t;

INFO: "t": removed 0 row versions in 1 pages
INFO: "t": found 0 removable, 9 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 925
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
3 tuples are left as dead.

Three tuple versions have vanished. Actually they were removed
but not shown in the message.

heap_prune_chain() doesn't count a live root entry of a chain as
"unused (line pointer)" since it is marked as "redierected". As
the result the vanished tuples are counted in tups_vacuumed, not
in tups_pruned. Maybe the name tups_vacuumed is confusing. After
removing tups_pruned code it works correctly.

INFO: "t": removed 6 row versions in 1 pages
INFO: "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 932
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
3 tuples are left as dead.

I see two choices of the second line above.

1> "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages

"removable" includes "left dead" tuples.

2> "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages

"removable" excludes "left dead" tuples.

If you prefer the latter, removable and nonremoveable need to be
corrected using nleft.

I think that the first vacuum should report the former message because
it's true that the table has 6 removable tuples. We remove 6 tuples
but leave 3 item pointers. So in the second vacuum, it should be
"found 0 removable, 9 nonremovable row versions ..." and "3 tuples are
left as dead". But to report more precisely it'd be better to report
"0 tuples and 3 item identifiers are left as dead" here.

Attached updated patch incorporated all of comments. Also I've added
new reloption vacuum_index_cleanup as per discussion on the "reloption
to prevent VACUUM from truncating empty pages at the end of relation"
thread. Autovacuums also can skip index cleanup when the reloption is
set to false. Since the setting this to false might lead some problems
I've made autovacuums report the number of dead tuples and dead
itemids we left.

Regards,

Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v8-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchapplication/x-patch; name=v8-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchDownload
From 70358ea1ae7d7c8f78d9101176f4cc1652c3c978 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v8 2/2] Add --diable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..d227225 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(DISABLE_INDEX_CLEANUP\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..4705217 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* DISABLE_INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
2.10.5

v8-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchapplication/x-patch; name=v8-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From ecfaa508634bd7b17e43d9806bc0653617e2f296 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 21 Jan 2019 19:07:44 +0900
Subject: [PATCH v8 1/2] Add DISABLE_INDEX_CLEANUP option to VACUUM command

With this option, VACUUM does HOT-pruning for live tuples but doesn't
remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 16 ++++++
 doc/src/sgml/ref/vacuum.sgml           | 29 ++++++++++-
 src/backend/access/common/reloptions.c | 13 ++++-
 src/backend/access/heap/vacuumlazy.c   | 93 +++++++++++++++++++++++++++-------
 src/backend/commands/vacuum.c          | 15 +++++-
 src/backend/parser/gram.y              |  2 +
 src/bin/psql/tab-complete.c            |  4 +-
 src/include/nodes/parsenodes.h         |  3 +-
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  3 ++
 src/test/regress/sql/vacuum.sql        |  3 ++
 11 files changed, 159 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 22dbc07..cd9cea9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1345,6 +1345,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Per table setting to use <literal>DISABLE_INDEX_CLEANUP</literal> option
+      of <command>VACUUM</command> command. The default value is true. If false,
+      autovacuum daemon and <command>VACUUM</command> never perform index vacuuming
+      and index cleanup, that is, always set <literal>DISABLE_INDEX_CLEANUP</literal>
+      option. Note that out of disk space due to index bloat. Setting this
+      parameter to false makes sense to avoid scanning large indexes when the
+      table has a few dead tuples. See <xref linkend="sql-vacuum"/> for more
+      details on <literal>DISABLE_INDEX_CLEANUP</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..ac4a104 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -161,7 +162,26 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
-   <varlistentry>
+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both the table and its indexes and marks the
+      corresponding item identifers as available for re-use. With this option
+      <command>VACUUM</command> still removes dead tuples from the table, but
+      it does not process any indexes, and the item identifers are marked as
+      dead instead of available. This is suitable for avoiding transaction
+      ID wraparound (see <xref linkend="vacuum-for-wraparound"/>) but not
+      sufficient for avoiding index bloat. This option is ignored if the
+      table does not have index. This cannot be used in conjunction with
+      <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+    <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
@@ -261,6 +281,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a values for <literal>vacuum_index_cleanup</literal> via
+    <xref linkend="sql-altertable"/> also enables and disables index cleanup.
+    The <literal>DISABLE_INDEX_CLEANUP</literal> options to
+    <command>VACUUM</command> takes precedence over this option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cdf1f4a..2cb423d 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..cad0f98 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,7 +112,12 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
+	/*
+	 * hasindex = true means two-pass strategy; false means one-pass.
+	 *
+	 * If DISABLE_INDEX_CLEANUP is used, hasindex may still be true,
+	 * but we'll always use the one-pass strategy.
+	 */
 	bool		hasindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
@@ -125,6 +130,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -167,7 +174,8 @@ static bool should_attempt_truncation(LVRelStats *vacrelstats);
 static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+							 bool skip_index_vacuum);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -403,6 +411,9 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
+			appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+							 vacrelstats->nleft_dead_tuples,
+							 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -485,13 +496,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
+	bool		skip_index_vacuum = (options & VACOPT_DISABLE_INDEX_CLEANUP) != 0;
 	xl_heap_freeze_tuple *frozen;
 	StringInfoData buf;
 	const int	initprog_index[] = {
@@ -517,7 +532,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused =
+		nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -529,7 +545,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	vacrelstats->nonempty_pages = 0;
 	vacrelstats->latestRemovedXid = InvalidTransactionId;
 
-	lazy_space_alloc(vacrelstats, nblocks);
+	lazy_space_alloc(vacrelstats, nblocks, skip_index_vacuum);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
@@ -722,6 +738,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			};
 			int64		hvp_val[2];
 
+			/* Index vacuum must be enabled in two-pass vacuum */
+			Assert(!skip_index_vacuum);
+
 			/*
 			 * Before beginning index vacuuming, we release any pin we may
 			 * hold on the visibility map page.  This isn't necessary for
@@ -1054,7 +1073,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index vacuum is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1206,15 +1234,33 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes or index vacuum is disabled we can
+		 * vacuum the page right now instead of doing a second scan.
 		 */
-		if (nindexes == 0 &&
+		if ((nindexes == 0 || skip_index_vacuum) &&
 			vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index vacuum is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				Assert(skip_index_vacuum);
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1222,7 +1268,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1348,7 +1393,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1379,6 +1426,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		};
 		int64		hvp_val[2];
 
+		/* Index vacuum must be enabled in two-pass vacuum */
+		Assert(!skip_index_vacuum);
+
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
@@ -1417,10 +1467,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (!skip_index_vacuum)
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/*
+	 * If no index or index vacuum is disabled, make log report that
+	 * lazy_vacuum_heap would've make.
+	 */
 	if (vacuumed_pages)
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
@@ -1449,6 +1503,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2085,14 +2141,15 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks,
+				 bool skip_index_vacuum)
 {
 	long		maxtuples;
 	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->hasindex && !skip_index_vacuum)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..a0f2622 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -203,7 +203,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -211,6 +212,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
@@ -1672,6 +1678,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	LockRelationIdForSession(&onerelid, lmode);
 
 	/*
+	 * Disables index cleanup based on reloptions.
+	 */
+	if (onerel->rd_options &&
+		!((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+		options |= VACOPT_DISABLE_INDEX_CLEANUP;
+
+	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0279013..0b13ad7 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10471,6 +10471,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7b7a88f..b4e416d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3423,7 +3424,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING");
+						  "DISABLE_PAGE_SKIPPING",
+						  "DISABLE_INDEX_CLEANUP");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a7e859d..f6d52ac 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3163,7 +3163,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* skip index vacuum and cleanup */
 } VacuumOption;
 
 /*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1d05465..14e325e 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -253,6 +253,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..cffce59 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
2.10.5

#42Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#41)
Re: New vacuum option to do only freezing

On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached updated patch incorporated all of comments. Also I've added
new reloption vacuum_index_cleanup as per discussion on the "reloption
to prevent VACUUM from truncating empty pages at the end of relation"
thread. Autovacuums also can skip index cleanup when the reloption is
set to false. Since the setting this to false might lead some problems
I've made autovacuums report the number of dead tuples and dead
itemids we left.

It seems to me that the disable_index_cleanup should be renamed
index_cleanup and the default should be changed to true, for
consistency with the reloption (and, perhaps, other patches).

- num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+ num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused =
+ nleft_dead_itemids = nleft_dead_tuples = 0;

I would suggest leaving the existing line alone (and not adding an
extra space to it as the patch does now) and just adding a second
initialization on the next line as a separate statement. a = b = c = d
= e = 0 isn't such great coding style that we should stick to it
rigorously even when it ends up having to be broken across lines.

+ /* Index vacuum must be enabled in two-pass vacuum */
+ Assert(!skip_index_vacuum);

I am a big believer in naming consistency. Please, let's use the same
name everywhere! If it's going to be index_cleanup, then call the
reloption vacuum_index_cleanup, and call the option index_cleanup, and
call the variable index_cleanup. Picking a different subset of
cleanup, index, vacuum, skip, and disable for each new name makes it
harder to understand.

- * If there are no indexes then we can vacuum the page right now
- * instead of doing a second scan.
+ * If there are no indexes or index vacuum is disabled we can
+ * vacuum the page right now instead of doing a second scan.

This comment is wrong. That wouldn't be safe. And that's probably
why it's not what the code does.

- /* If no indexes, make log report that lazy_vacuum_heap would've made */
+ /*
+ * If no index or index vacuum is disabled, make log report that
+ * lazy_vacuum_heap would've make.
+ */
  if (vacuumed_pages)

Hmm, does this really do what the comment claims? It looks to me like
we only increment vacuumed_pages when we call lazy_vacuum_page(), and
we (correctly) don't do that when index cleanup is disabled, but then
here this claims that if (vacuumed_pages) will be true in that case.

I wonder if it would be cleaner to rename vacrelstate->hasindex to
'useindex' and set it to false if there are no indexes or index
cleanup is disabled. But that might actually be worse, not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#43Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#42)
2 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Mar 7, 2019 at 3:55 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached updated patch incorporated all of comments. Also I've added
new reloption vacuum_index_cleanup as per discussion on the "reloption
to prevent VACUUM from truncating empty pages at the end of relation"
thread. Autovacuums also can skip index cleanup when the reloption is
set to false. Since the setting this to false might lead some problems
I've made autovacuums report the number of dead tuples and dead
itemids we left.

It seems to me that the disable_index_cleanup should be renamed
index_cleanup and the default should be changed to true, for
consistency with the reloption (and, perhaps, other patches).

Hmm, the patch already has new reloption vacuum_index_cleanup and
default value is true but you meant I should change its name to
index_cleanup?

- num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+ num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused =
+ nleft_dead_itemids = nleft_dead_tuples = 0;

I would suggest leaving the existing line alone (and not adding an
extra space to it as the patch does now) and just adding a second
initialization on the next line as a separate statement. a = b = c = d
= e = 0 isn't such great coding style that we should stick to it
rigorously even when it ends up having to be broken across lines.

Fixed.

+ /* Index vacuum must be enabled in two-pass vacuum */
+ Assert(!skip_index_vacuum);

I am a big believer in naming consistency. Please, let's use the same
name everywhere! If it's going to be index_cleanup, then call the
reloption vacuum_index_cleanup, and call the option index_cleanup, and
call the variable index_cleanup. Picking a different subset of
cleanup, index, vacuum, skip, and disable for each new name makes it
harder to understand.

Fixed.

- * If there are no indexes then we can vacuum the page right now
- * instead of doing a second scan.
+ * If there are no indexes or index vacuum is disabled we can
+ * vacuum the page right now instead of doing a second scan.

This comment is wrong. That wouldn't be safe. And that's probably
why it's not what the code does.

Fixed.

- /* If no indexes, make log report that lazy_vacuum_heap would've made */
+ /*
+ * If no index or index vacuum is disabled, make log report that
+ * lazy_vacuum_heap would've make.
+ */
if (vacuumed_pages)

Hmm, does this really do what the comment claims? It looks to me like
we only increment vacuumed_pages when we call lazy_vacuum_page(), and
we (correctly) don't do that when index cleanup is disabled, but then
here this claims that if (vacuumed_pages) will be true in that case.

You're right, vacuumed_pages never be > 0 in disable_index_cleanup case. Fixed.

I wonder if it would be cleaner to rename vacrelstate->hasindex to
'useindex' and set it to false if there are no indexes or index
cleanup is disabled. But that might actually be worse, not sure.

I tried the changes and it seems good idea to me. Fixed.

Attached the updated version patches.

Regards,

--

Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v9-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchtext/x-patch; charset=US-ASCII; name=v9-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patchDownload
From 8ab648be35fe4fd864fabd17adf72fb476a3367c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v9 1/2] Add DISABLE_INDEX_CLEANUP option to VACUUM command

With this option, VACUUM does HOT-pruning for live tuples but doesn't
remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 16 ++++++++
 doc/src/sgml/ref/vacuum.sgml           | 27 ++++++++++++
 src/backend/access/common/reloptions.c | 13 +++++-
 src/backend/access/heap/vacuumlazy.c   | 75 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 15 ++++++-
 src/backend/parser/gram.y              |  2 +
 src/bin/psql/tab-complete.c            |  4 +-
 src/include/nodes/parsenodes.h         |  3 +-
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  3 ++
 src/test/regress/sql/vacuum.sql        |  3 ++
 11 files changed, 142 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 22dbc07..cd9cea9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1345,6 +1345,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Per table setting to use <literal>DISABLE_INDEX_CLEANUP</literal> option
+      of <command>VACUUM</command> command. The default value is true. If false,
+      autovacuum daemon and <command>VACUUM</command> never perform index vacuuming
+      and index cleanup, that is, always set <literal>DISABLE_INDEX_CLEANUP</literal>
+      option. Note that out of disk space due to index bloat. Setting this
+      parameter to false makes sense to avoid scanning large indexes when the
+      table has a few dead tuples. See <xref linkend="sql-vacuum"/> for more
+      details on <literal>DISABLE_INDEX_CLEANUP</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fd911f5..3162793 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -31,6 +31,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     VERBOSE
     ANALYZE
     DISABLE_PAGE_SKIPPING
+    DISABLE_INDEX_CLEANUP
     SKIP_LOCKED
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -162,6 +163,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both the table and its indexes and marks the
+      corresponding item identifers as available for re-use. With this option
+      <command>VACUUM</command> still removes dead tuples from the table, but
+      it does not process any indexes, and the item identifers are marked as
+      dead instead of available. This is suitable for avoiding transaction
+      ID wraparound (see <xref linkend="vacuum-for-wraparound"/>) but not
+      sufficient for avoiding index bloat. This option is ignored if the
+      table does not have index. This cannot be used in conjunction with
+      <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>SKIP_LOCKED</literal></term>
     <listitem>
      <para>
@@ -261,6 +281,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a values for <literal>vacuum_index_cleanup</literal> via
+    <xref linkend="sql-altertable"/> also enables and disables index cleanup.
+    The <literal>DISABLE_INDEX_CLEANUP</literal> options to
+    <command>VACUUM</command> takes precedence over this option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cdf1f4a..2cb423d 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9416c31..cd5b975 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -258,7 +260,8 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							(options & VACOPT_DISABLE_INDEX_CLEANUP) == 0);
 
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
@@ -332,7 +335,7 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -403,6 +406,9 @@ heap_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
+			appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+							 vacrelstats->nleft_dead_tuples,
+							 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -485,7 +491,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -518,6 +527,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -1054,7 +1064,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index cleanup is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1207,14 +1226,32 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 		/*
 		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * instead of doing a second scan. Also we don't do that and forget
+		 * dead tuples when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				Assert((options & VACOPT_DISABLE_INDEX_CLEANUP) != 0);
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1222,7 +1259,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1348,7 +1384,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1417,8 +1455,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1449,6 +1490,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2092,7 +2135,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df21..a0f2622 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -203,7 +203,8 @@ vacuum(int options, List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and DISABLE_INDEX_CLEANUP
+	 * option.
 	 */
 	if ((options & VACOPT_FULL) != 0 &&
 		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -211,6 +212,11 @@ vacuum(int options, List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_INDEX_CLEANUP) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_INDEX_CLEANUP cannot be used with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
@@ -1672,6 +1678,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	LockRelationIdForSession(&onerelid, lmode);
 
 	/*
+	 * Disables index cleanup based on reloptions.
+	 */
+	if (onerel->rd_options &&
+		!((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+		options |= VACOPT_DISABLE_INDEX_CLEANUP;
+
+	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 753af60..91f8081 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10495,6 +10495,8 @@ vacuum_option_elem:
 				{
 					if (strcmp($1, "disable_page_skipping") == 0)
 						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else if (strcmp($1, "disable_index_cleanup") == 0)
+						$$ = VACOPT_DISABLE_INDEX_CLEANUP;
 					else if (strcmp($1, "skip_locked") == 0)
 						$$ = VACOPT_SKIP_LOCKED;
 					else
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21c..0cd03ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3429,7 +3430,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "DISABLE_INDEX_CLEANUP");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fe35783..2e75906 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3164,7 +3164,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_DISABLE_INDEX_CLEANUP = 1 << 8	/* disable index vacuuming and cleanup */
 } VacuumOption;
 
 /*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 9d805ca..1a19e4a 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* do index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index fa9d663..cffce59 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9defa0d..9c4bdb7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vaccluster;
+VACUUM (DISABLE_INDEX_CLEANUP) vactst; -- DISABLE_INDEX_CLEANUP is ignored
+VACUUM (DISABLE_INDEX_CLEANUP, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

v9-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchtext/x-patch; charset=US-ASCII; name=v9-0002-Add-diable-index-cleanup-option-to-vacuumdb.patchDownload
From d25bbb5dce2d408b942ad4280cb54dc382f8757a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v9 2/2] Add --diable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..d227225 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(DISABLE_INDEX_CLEANUP\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..4705217 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* DISABLE_INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sDISABLE_INDEX_CLEANUP", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
1.8.3.1

#44Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#43)
Re: New vacuum option to do only freezing

On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Hmm, the patch already has new reloption vacuum_index_cleanup and
default value is true but you meant I should change its name to
index_cleanup?

No, I mean that you should make it so that someone writes VACUUM
(INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#45Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#44)
Re: New vacuum option to do only freezing

On Fri, Mar 8, 2019 at 12:04 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Hmm, the patch already has new reloption vacuum_index_cleanup and
default value is true but you meant I should change its name to
index_cleanup?

No, I mean that you should make it so that someone writes VACUUM
(INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP).

IIUC we've discussed the field-and-value style vacuum option. I
suggested that since we have already the disable_page_skipping option
the disable_page_skipping option would be more natural style and
consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
with its reloption but not with other vacuum options. So why does only
this option (and probably up-coming new options) need to support new
style? Do we need the same change to the existing options?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#46Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#45)
Re: New vacuum option to do only freezing

On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

IIUC we've discussed the field-and-value style vacuum option. I
suggested that since we have already the disable_page_skipping option
the disable_page_skipping option would be more natural style and
consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
with its reloption but not with other vacuum options. So why does only
this option (and probably up-coming new options) need to support new
style? Do we need the same change to the existing options?

Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
some other way; it's been released, and we're stuck with it at this
point. However, I generally believe that it is preferable to phrase
options positively then negatively, so that for example one writes
EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING). So
I'd like to do it that way for the new options that we're proposing to
add.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#47Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#46)
2 attachment(s)
Re: New vacuum option to do only freezing

On Sat, Mar 23, 2019 at 3:25 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

IIUC we've discussed the field-and-value style vacuum option. I
suggested that since we have already the disable_page_skipping option
the disable_page_skipping option would be more natural style and
consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
with its reloption but not with other vacuum options. So why does only
this option (and probably up-coming new options) need to support new
style? Do we need the same change to the existing options?

Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
some other way; it's been released, and we're stuck with it at this
point.

Agreed.

However, I generally believe that it is preferable to phrase
options positively then negatively, so that for example one writes
EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING). So
I'd like to do it that way for the new options that we're proposing to
add.

Agreed with using phrase options positively than negatively. Since
DISABLE_PAGE_SKIPPING is an option for emergency we might be able to
rename for consistency in a future release.

Attached updated version patches. 0001 patch can be applied on top of
the patch that allows the all existing options have one boolean
argument, which I've attached on another thread[1]/messages/by-id/CAD21AoBg8CBf1OAse6ESKJmNBon14h3nAR67nJhZ=yujA+Lk4A@mail.gmail.com. So please apply
them in following order.

1. v20-0001-All-VACUUM-command-options-allow-an-argument.patch
(attached on [1]/messages/by-id/CAD21AoBg8CBf1OAse6ESKJmNBon14h3nAR67nJhZ=yujA+Lk4A@mail.gmail.com thread)
2. v10-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch
3. v10-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch

I kept the --disable-index-cleanup option of vacuumdb command since
perhaps it would be understandable to specify this option rather than
setting true/false as a command line option.

Please review the patches.

[1]: /messages/by-id/CAD21AoBg8CBf1OAse6ESKJmNBon14h3nAR67nJhZ=yujA+Lk4A@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v10-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchtext/x-patch; charset=US-ASCII; name=v10-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchDownload
From 6900a0d3fec03748c4d413af4e379a9bf1209718 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v10 1/2] Add INDEX_CLEANUP option to VACUUM command

If this option is false, VACUUM does HOT-pruning for live tuples but
doesn't remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 17 +++++++
 doc/src/sgml/ref/vacuum.sgml           | 28 ++++++++++++
 src/backend/access/common/reloptions.c | 13 +++++-
 src/backend/access/heap/vacuumlazy.c   | 82 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 22 ++++++++-
 src/backend/postmaster/autovacuum.c    |  2 +-
 src/bin/psql/tab-complete.c            |  6 +--
 src/include/commands/vacuum.h          |  3 +-
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  3 ++
 src/test/regress/sql/vacuum.sql        |  3 ++
 11 files changed, 153 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index e94fe2c..4209743 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1345,6 +1345,23 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Per table setting to use <literal>INDEX_CLEANUP</literal> option
+      of <command>VACUUM</command> command. The default value is <literal>true</literal>.
+      If <literal>false</literal>, autovacuum daemon and <command>VACUUM</command>
+      never perform index vacuuming and index cleanup, that is, always set
+      <literal>INDEX_CLEANUP</literal> option to <literal>false</literal>.
+      Note that out of disk space due to index bloat. Setting this parameter to
+      <literal>false</literal> makes sense to avoid scanning large indexes when
+      the table has a few dead tuples. See <xref linkend="sql-vacuum"/> for more
+      details on <literal>INDEX_CLEANUP</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..3f87979 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,26 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both the table and its indexes and marks the
+      corresponding item identifers as available for re-use. If this option
+      is <literal>FALSE</literal> <command>VACUUM</command> still removes
+      dead tuples from the table, but it does not process any indexes, and
+      the item identifers are marked as dead instead of available. This is
+      suitable for avoiding transaction ID wraparound
+      (see <xref linkend="vacuum-for-wraparound"/>) but not sufficient for
+      avoiding index bloat. It defaults to <literal>TRUE</literal>.
+      This option is ignored if the table does not have index. This cannot
+      be used in conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -275,6 +296,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a values for <literal>vacuum_index_cleanup</literal> via
+    <xref linkend="sql-altertable"/> also enables and disables index cleanup.
+    The <literal>INDEX_CLEANUP</literal> options to
+    <command>VACUUM</command> takes precedence over this option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 3b0b138..fa11833 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..9b1513d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -258,7 +260,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							 (params->options & VACOPT_INDEX_CLEANUP) != 0);
 
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
@@ -332,7 +335,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -404,6 +407,10 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
 			appendStringInfo(&buf,
+							 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
+							 vacrelstats->nleft_dead_tuples,
+							 vacrelstats->nleft_dead_itemids);
+			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
 							 VacuumPageMiss,
@@ -485,7 +492,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -517,7 +527,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -1054,7 +1065,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index cleanup is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1206,15 +1226,33 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				Assert((options & VACOPT_INDEX_CLEANUP) == 0);
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1222,7 +1260,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1348,7 +1385,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1417,10 +1456,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
-	/* If no indexes, make log report that lazy_vacuum_heap would've made */
+	/* If no index, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
@@ -1449,6 +1491,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2092,7 +2136,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 72f140e..9e0cd07 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -89,7 +89,10 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	VacuumParams params;
 	ListCell	*lc;
 
-	params.options = vacstmt->is_vacuumcmd ? VACOPT_VACUUM : VACOPT_ANALYZE;
+	/* Vacuum enables index cleanup by default */
+	params.options = vacstmt->is_vacuumcmd
+		? VACOPT_VACUUM | VACOPT_INDEX_CLEANUP
+		: VACOPT_ANALYZE;
 
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
@@ -116,6 +119,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			params.options |= defGetBoolean(opt) ? VACOPT_FULL : 0;
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			params.options |= defGetBoolean(opt) ? VACOPT_DISABLE_PAGE_SKIPPING : 0;
+		else if (strcmp(opt->defname, "index_cleanup") == 0)
+			params.options |= defGetBoolean(opt) ? VACOPT_INDEX_CLEANUP : 0;
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -237,7 +242,8 @@ vacuum(List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and INDEX_CLEANUP
+	 * option.
 	 */
 	if ((params->options & VACOPT_FULL) != 0 &&
 		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -245,6 +251,11 @@ vacuum(List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((params->options & VACOPT_FULL) != 0 &&
+		(params->options & VACOPT_INDEX_CLEANUP) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
@@ -1706,6 +1717,13 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	LockRelationIdForSession(&onerelid, lmode);
 
 	/*
+	 * Disables index cleanup based on reloptions.
+	 */
+	if (onerel->rd_options &&
+		!((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+		params->options &= ~(VACOPT_INDEX_CLEANUP);
+
+	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..51c60fc 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2883,7 +2883,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
 		tab->at_params.options = VACOPT_SKIPTOAST |
-			(dovacuum ? VACOPT_VACUUM : 0) |
+			(dovacuum ? VACOPT_VACUUM | VACOPT_INDEX_CLEANUP : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.freeze_min_age = freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 36e20fb..2d5bdd6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3431,9 +3432,8 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
-		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
-			COMPLETE_WITH("ON", "OFF");
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "INDEX_CLEANUP");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
 		/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..7bbc18c 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -145,7 +145,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_INDEX_CLEANUP = 1 << 8	/* Do index vacuum and cleanup */
 } VacuumOption;
 
 /*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 5402851..89a7fbf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..a1de992 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..8745e34 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

v10-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchtext/x-patch; charset=US-ASCII; name=v10-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchDownload
From 1beffb4775f1d4ab15dfd88730c733b0a6b10eb7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v10 2/2] Add --disable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..2b7cd18 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(INDEX_CLEANUP FALSE\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..c090ee7 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
1.8.3.1

#48Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#47)
Re: New vacuum option to do only freezing

On Wed, Mar 27, 2019 at 12:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Mar 23, 2019 at 3:25 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

IIUC we've discussed the field-and-value style vacuum option. I
suggested that since we have already the disable_page_skipping option
the disable_page_skipping option would be more natural style and
consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
with its reloption but not with other vacuum options. So why does only
this option (and probably up-coming new options) need to support new
style? Do we need the same change to the existing options?

Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
some other way; it's been released, and we're stuck with it at this
point.

Agreed.

However, I generally believe that it is preferable to phrase
options positively then negatively, so that for example one writes
EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING). So
I'd like to do it that way for the new options that we're proposing to
add.

Agreed with using phrase options positively than negatively. Since
DISABLE_PAGE_SKIPPING is an option for emergency we might be able to
rename for consistency in a future release.

Attached updated version patches.

The patch adds the basic functionality to disable index cleanup but
one possible argument could be whether we should always disable it
when anti-wraparound vacuum. As discussed on another thread[1]/messages/by-id/CAC8Q8t+j36G_bLF=+0iMo6jGNWnLnWb1tujXuJr-+x8ZCCTqoQ@mail.gmail.com
anti-wraparound vacuum still could lead the I/O burst problem and take
a long time, especially for append-only large table. Originally the
purpose of this feature is to resolve the problem that vacuum takes a
long time even if the table has just a few dead tuples, which is a
quite common situation of anti-wraparound vacuum. It might be too late
to discuss but if we always disable it when anti-wraparound vacuum
then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
PostgreSQL 12. Dose anyone have opinions?

[1]: /messages/by-id/CAC8Q8t+j36G_bLF=+0iMo6jGNWnLnWb1tujXuJr-+x8ZCCTqoQ@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#49Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#48)
Re: New vacuum option to do only freezing

On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The patch adds the basic functionality to disable index cleanup but
one possible argument could be whether we should always disable it
when anti-wraparound vacuum. As discussed on another thread[1]
anti-wraparound vacuum still could lead the I/O burst problem and take
a long time, especially for append-only large table. Originally the
purpose of this feature is to resolve the problem that vacuum takes a
long time even if the table has just a few dead tuples, which is a
quite common situation of anti-wraparound vacuum. It might be too late
to discuss but if we always disable it when anti-wraparound vacuum
then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
PostgreSQL 12. Dose anyone have opinions?

I think we can respect the configured value of the option even for
aggressive vacuums, but I don't think we should change aggressive
vacuums to work that way by default. You are correct that the table
might have only a few dead tuples, but it might also have a lot of
dead tuples; I have heard rumors of a PostgreSQL installation that had
autovacuum = off and non-stop wraparound autovacuums desperately
trying to forestall shutdown. That's probably a lot less likely now
that we have the freeze map and such a system would almost surely have
a nasty bloat problem, but disabling index cleanup by default would
make it worse.

I think the solution in the long run here is to (1) allow the
index_cleanup option (or the corresponding reloption) to override the
default behavior and (2) eventually change the default behavior from
'always yes' to 'depends on how many dead tuples we found'. But I
think that the second of those things is not appropriate to consider
changing in PG 12 at this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#50Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#49)
2 attachment(s)
Re: New vacuum option to do only freezing

On Fri, Mar 29, 2019 at 4:39 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The patch adds the basic functionality to disable index cleanup but
one possible argument could be whether we should always disable it
when anti-wraparound vacuum. As discussed on another thread[1]
anti-wraparound vacuum still could lead the I/O burst problem and take
a long time, especially for append-only large table. Originally the
purpose of this feature is to resolve the problem that vacuum takes a
long time even if the table has just a few dead tuples, which is a
quite common situation of anti-wraparound vacuum. It might be too late
to discuss but if we always disable it when anti-wraparound vacuum
then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
PostgreSQL 12. Dose anyone have opinions?

I think we can respect the configured value of the option even for
aggressive vacuums, but I don't think we should change aggressive
vacuums to work that way by default. You are correct that the table
might have only a few dead tuples, but it might also have a lot of
dead tuples; I have heard rumors of a PostgreSQL installation that had
autovacuum = off and non-stop wraparound autovacuums desperately
trying to forestall shutdown. That's probably a lot less likely now
that we have the freeze map and such a system would almost surely have
a nasty bloat problem, but disabling index cleanup by default would
make it worse.

Understood and agreed. Always setting it to false would affect much
and there are users who expect anti-wraparound vacuums to reclaim
garbage.

I think the solution in the long run here is to (1) allow the
index_cleanup option (or the corresponding reloption) to override the
default behavior and (2) eventually change the default behavior from
'always yes' to 'depends on how many dead tuples we found'. But I
think that the second of those things is not appropriate to consider
changing in PG 12 at this point.

The current patch already takes (1) and I agreed with you that (2)
would be for PG 13 or later. So the patch would be helpful for such
users as well.

Attached updated patches. These patches are applied on top of 0001
patch on parallel vacuum thread[1]/messages/by-id/CAD21AoBaFcKBAeL5_++j+Vzir2vBBcF4juW7qH8b3HsQY=Q6+w@mail.gmail.com.

[1]: /messages/by-id/CAD21AoBaFcKBAeL5_++j+Vzir2vBBcF4juW7qH8b3HsQY=Q6+w@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v11-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchtext/x-patch; charset=US-ASCII; name=v11-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchDownload
From 4a16701f327d26cf499cbe9c22916d5081a232d2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v11 2/2] Add --disable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..2b7cd18 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(INDEX_CLEANUP FALSE\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..c090ee7 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
1.8.3.1

v11-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchDownload
From 5a50a20c346e6e96006d4b92f3b1ec7b0692e66f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v11 1/2] Add INDEX_CLEANUP option to VACUUM command

If this option is false, VACUUM does HOT-pruning for live tuples but
doesn't remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 17 ++++++++
 doc/src/sgml/ref/vacuum.sgml           | 28 ++++++++++++
 src/backend/access/common/reloptions.c | 13 +++++-
 src/backend/access/heap/vacuumlazy.c   | 78 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 21 ++++++++-
 src/backend/postmaster/autovacuum.c    |  2 +-
 src/bin/psql/tab-complete.c            |  6 ++-
 src/include/commands/vacuum.h          |  3 +-
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  3 ++
 src/test/regress/sql/vacuum.sql        |  3 ++
 11 files changed, 151 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 1660784..a6eba05 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1345,6 +1345,23 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Per table setting to use <literal>INDEX_CLEANUP</literal> option
+      of <command>VACUUM</command> command. The default value is <literal>true</literal>.
+      If <literal>false</literal>, autovacuum daemon and <command>VACUUM</command>
+      never perform index vacuuming and index cleanup, that is, always set
+      <literal>INDEX_CLEANUP</literal> option to <literal>false</literal>.
+      Note that out of disk space due to index bloat. Setting this parameter to
+      <literal>false</literal> makes sense to avoid scanning large indexes when
+      the table has a few dead tuples. See <xref linkend="sql-vacuum"/> for more
+      details on <literal>INDEX_CLEANUP</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..3f87979 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,26 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both the table and its indexes and marks the
+      corresponding item identifers as available for re-use. If this option
+      is <literal>FALSE</literal> <command>VACUUM</command> still removes
+      dead tuples from the table, but it does not process any indexes, and
+      the item identifers are marked as dead instead of available. This is
+      suitable for avoiding transaction ID wraparound
+      (see <xref linkend="vacuum-for-wraparound"/>) but not sufficient for
+      avoiding index bloat. It defaults to <literal>TRUE</literal>.
+      This option is ignored if the table does not have index. This cannot
+      be used in conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
@@ -275,6 +296,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
 
    <para>
+    Setting a values for <literal>vacuum_index_cleanup</literal> via
+    <xref linkend="sql-altertable"/> also enables and disables index cleanup.
+    The <literal>INDEX_CLEANUP</literal> options to
+    <command>VACUUM</command> takes precedence over this option.
+   </para>
+
+   <para>
     <command>VACUUM</command> causes a substantial increase in I/O traffic,
     which might cause poor performance for other active sessions.  Therefore,
     it is sometimes advisable to use the cost-based vacuum delay feature.
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b58a1f7..e2c0de3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5c554f9..4911131 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -258,7 +260,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							 (params->options & VACOPT_INDEX_CLEANUP) != 0);
 
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
@@ -332,7 +335,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -404,6 +407,10 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
 			appendStringInfo(&buf,
+							 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
+							 vacrelstats->nleft_dead_tuples,
+							 vacrelstats->nleft_dead_itemids);
+			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
 							 VacuumPageMiss,
@@ -485,7 +492,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -518,6 +528,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -1054,7 +1065,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index cleanup is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1206,15 +1226,33 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				Assert((options & VACOPT_INDEX_CLEANUP) == 0);
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1222,7 +1260,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1348,7 +1385,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1417,8 +1456,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1449,6 +1491,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2092,7 +2136,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 10df766..1d1b215 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -93,6 +93,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool freeze = false;
 	bool full = false;
 	bool disable_page_skipping = false;
+	bool index_cleanup = true;	/* by default */
 	ListCell	*lc;
 
 	/* Parse options list */
@@ -120,6 +121,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "index_cleanup") == 0)
+			index_cleanup = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -135,7 +138,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(analyze ? VACOPT_ANALYZE : 0) |
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
-		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0);
+		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
+		(index_cleanup ? VACOPT_INDEX_CLEANUP : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -251,7 +255,8 @@ vacuum(List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and INDEX_CLEANUP
+	 * option.
 	 */
 	if ((params->options & VACOPT_FULL) != 0 &&
 		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -259,6 +264,11 @@ vacuum(List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((params->options & VACOPT_FULL) != 0 &&
+		(params->options & VACOPT_INDEX_CLEANUP) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
@@ -1720,6 +1730,13 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	LockRelationIdForSession(&onerelid, lmode);
 
 	/*
+	 * Disables index cleanup based on reloptions.
+	 */
+	if (onerel->rd_options &&
+		!((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+		params->options &= ~(VACOPT_INDEX_CLEANUP);
+
+	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
 	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..51c60fc 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2883,7 +2883,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
 		tab->at_params.options = VACOPT_SKIPTOAST |
-			(dovacuum ? VACOPT_VACUUM : 0) |
+			(dovacuum ? VACOPT_VACUUM | VACOPT_INDEX_CLEANUP : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
 		tab->at_params.freeze_min_age = freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index c18977c..c60efd9 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3431,8 +3432,9 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
-		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "INDEX_CLEANUP");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..7bbc18c 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -145,7 +145,8 @@ typedef enum VacuumOption
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_SKIP_LOCKED = 1 << 5,	/* skip if cannot get lock */
 	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
-	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,	/* don't skip any pages */
+	VACOPT_INDEX_CLEANUP = 1 << 8	/* Do index vacuum and cleanup */
 } VacuumOption;
 
 /*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 5402851..89a7fbf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..a1de992 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,9 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE) vaccluster;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..8745e34 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -61,6 +61,9 @@ VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE) vaccluster;
 
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
-- 
1.8.3.1

#51Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#50)
1 attachment(s)
Re: New vacuum option to do only freezing

On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached updated patches. These patches are applied on top of 0001
patch on parallel vacuum thread[1].

+ bool index_cleanup = true; /* by default */

I think we should instead initialize index_cleanup to the reloption
value, if there is one, or true if none, and then let it be overridden
by the loop that follows, where whatever the user specifies in the SQL
command is processed. That way, any explicitly-specified option
within the command itself wins, and the reloption sets the default.
As you have it, index cleanup is disabled when the reloption is set to
false even if the user wrote VACUUM (INDEX_CLEANUP TRUE).

+            appendStringInfo(&buf,
+                             _("%.0f tuples and %.0f item identifiers are left
as dead.\n"),
+                             vacrelstats->nleft_dead_tuples,
+                             vacrelstats->nleft_dead_itemids);

I tend to think we should omit this line entirely if both values are
0, as will very often be the case.

+    if ((params->options & VACOPT_FULL) != 0 &&
+        (params->options & VACOPT_INDEX_CLEANUP) == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("VACUUM option INDEX_CLEANUP cannot be set to
false with FULL")));

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

I wasn't all that happy with the documentation changes you proposed.
Please find attached a proposed set of doc changes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

vacuum-index-cleanup-doc-rmh.patchapplication/octet-stream; name=vacuum-index-cleanup-doc-rmh.patchDownload
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 166078410c..2ce7a0cdbc 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1344,6 +1344,21 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+     Enables or disables index cleanup when <command>VACUUM</command> is
+     run on this table.  The default value is <literal>true</literal>.
+     Disabling index cleanup can speed of <command>VACUUM</command> very
+     significantly, but may also lead to severely bloated indexes if table
+     modifications are frequent.  The <literal>INDEX_CLEANUP</literal>
+     parameter to <xref linkend="sql-vacuum"/>, if specified, overrides
+     the value of this option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2ad7..13b9735c22 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -181,6 +182,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to remove
+      index entries pointing to dead tuples.  This is normally the desired
+      behavior and is the default unless the
+      <literal>vacuum_index_cleanup</literal> option has been set to false
+      for the table to be vacuumed.  Setting this option to false may be
+      useful when it is necessary to make vacuum run as quickly as possible,
+      for example to avoid imminent transaction ID wraparond
+      (see <xref linkend="vacuum-for-wraparound"/>).  However, if index
+      cleanup is not performed regularly, performance may suffer, because
+      as the table is modified, indexes will accumulate dead line pointers
+      and the table itself will accumulate dead line pointers that cannot be
+      removed until index cleanup is completed.  This option has no effect
+      when <literal>FULL</literal> is used or for tables that do not have
+      an index.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
#52Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#51)
Re: New vacuum option to do only freezing

On Fri, Mar 29, 2019 at 10:46 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached updated patches. These patches are applied on top of 0001
patch on parallel vacuum thread[1].

+ bool index_cleanup = true; /* by default */

I think we should instead initialize index_cleanup to the reloption
value, if there is one, or true if none, and then let it be overridden
by the loop that follows, where whatever the user specifies in the SQL
command is processed. That way, any explicitly-specified option
within the command itself wins, and the reloption sets the default.
As you have it, index cleanup is disabled when the reloption is set to
false even if the user wrote VACUUM (INDEX_CLEANUP TRUE).

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it in manner of you suggested after opened the relation.
Is that right?

+            appendStringInfo(&buf,
+                             _("%.0f tuples and %.0f item identifiers are left
as dead.\n"),
+                             vacrelstats->nleft_dead_tuples,
+                             vacrelstats->nleft_dead_itemids);

I tend to think we should omit this line entirely if both values are
0, as will very often be the case.

Fixed.

+    if ((params->options & VACOPT_FULL) != 0 &&
+        (params->options & VACOPT_INDEX_CLEANUP) == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("VACUUM option INDEX_CLEANUP cannot be set to
false with FULL")));

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

I wasn't all that happy with the documentation changes you proposed.
Please find attached a proposed set of doc changes.

Thank you! I've incorporated these changes.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#53Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#52)
Re: New vacuum option to do only freezing

On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it in manner of you suggested after opened the relation.
Is that right?

Blech, no, let's not do that. We'd better use some method that can
indicate yes/no/default. Something like psql's trivalue thingy, but
probably not exactly that. We can define an enum for this purpose,
for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or
maybe there's some other way. But let's not pass bits of the parse
tree around any more than really required.

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

Well, that's a fair point, I guess. If we go that that route, we'll
need to make sure that setting the reloption doesn't prevent VACUUM
FULL from working -- the complaint must only affect an explicit option
on the VACUUM command line. I think we should have a regression test
for that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#54Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#53)
2 attachment(s)
Re: New vacuum option to do only freezing

On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it in manner of you suggested after opened the relation.
Is that right?

Blech, no, let's not do that. We'd better use some method that can
indicate yes/no/default. Something like psql's trivalue thingy, but
probably not exactly that. We can define an enum for this purpose,
for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or
maybe there's some other way. But let's not pass bits of the parse
tree around any more than really required.

I've defined an enum VacOptTernaryValue representing
enabled/disabled/default and added index_cleanup variable as the new
enum type to VacuumParams. The vacuum options that uses the reloption
value as the default value such as index cleanup and new truncation
option can use this enum and set either enabled or disabled after
opened the relation when it’s set to default. Please find the attached
patches.

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

Well, that's a fair point, I guess. If we go that that route, we'll
need to make sure that setting the reloption doesn't prevent VACUUM
FULL from working -- the complaint must only affect an explicit option
on the VACUUM command line. I think we should have a regression test
for that.

I've added regression tests. Since we check it before setting
index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
working even when the vacuum_index_cleanup is false.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v12-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchapplication/x-patch; name=v12-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchDownload
From ca7a9f26283558525130a6c3d072096422aff525 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v12 1/2] Add INDEX_CLEANUP option to VACUUM command

If this option is false, VACUUM does HOT-pruning for live tuples but
doesn't remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 15 +++++++
 doc/src/sgml/ref/vacuum.sgml           | 23 ++++++++++
 src/backend/access/common/reloptions.c | 13 +++++-
 src/backend/access/heap/vacuumlazy.c   | 79 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 37 +++++++++++++++-
 src/backend/postmaster/autovacuum.c    |  1 +
 src/bin/psql/tab-complete.c            |  6 ++-
 src/include/commands/vacuum.h          | 15 +++++++
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   | 11 +++++
 src/test/regress/sql/vacuum.sql        | 11 +++++
 11 files changed, 191 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 0fcbc66..910db52 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1378,6 +1378,21 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Enables or disables index cleanup when <command>VACUUM</command> is
+      run on this table.  The default value is <literal>true</literal>.
+      Disabling index cleanup can speed of <command>VACUUM</command> very
+      significantly, but may also lead to severely bloated indexes if table
+      modifications are frequent.  The <literal>INDEX_CLEANUP</literal>
+      parameter to <xref linkend="sql-vacuum"/>, if specified, overrides
+      the value of this option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..643e97b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to remove
+      index entries pointing to dead tuples.  This is normally the desired
+      behavior and is the default unless the
+      <literal>vacuum_index_cleanup</literal> option has been set to false
+      for the table to be vacuumed.  Setting this option to false may be
+      useful when it is necessary to make vacuum run as quickly as possible,
+      for example to avoid imminent transaction ID wraparound
+      (see <xref linkend="vacuum-for-wraparound"/>).  However, if index
+      cleanup is not performed regularly, performance may suffer, because
+      as the table is modified, indexes will accumulate dead line pointers
+      and the table itself will accumulate dead line pointers that cannot be
+      removed until index cleanup is completed.  This option has no effect
+      for tables that do not have an index and cannot be used in conjunction
+      with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b58a1f7..e2c0de3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b5b464e..3619a01 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -275,7 +277,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							 params->index_cleanup == VACUUM_OPTION_ENABLED);
 
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
@@ -349,7 +352,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -419,6 +422,12 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
+			if (vacrelstats->nleft_dead_tuples > 0 ||
+				vacrelstats->nleft_dead_itemids > 0)
+				appendStringInfo(&buf,
+								 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
+								 vacrelstats->nleft_dead_tuples,
+								 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -501,7 +510,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -534,6 +546,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -1070,7 +1083,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index cleanup is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1222,15 +1244,32 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1238,7 +1277,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1364,7 +1402,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1433,8 +1473,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1465,6 +1508,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2108,7 +2153,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index fd2e47f..bae8034 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -76,6 +76,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
  * Primary entry point for manual VACUUM and ANALYZE commands
@@ -95,6 +96,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool disable_page_skipping = false;
 	ListCell	*lc;
 
+	/* Set default value */
+	params.index_cleanup = VACUUM_OPTION_DEFAULT;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -120,6 +124,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "index_cleanup") == 0)
+			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -251,7 +257,8 @@ vacuum(List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 * Sanity check DISABLE_PAGE_SKIPPING option and INDEX_CLEANUP
+	 * option.
 	 */
 	if ((params->options & VACOPT_FULL) != 0 &&
 		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
@@ -259,6 +266,12 @@ vacuum(List *relations, VacuumParams *params,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
 
+	if ((params->options & VACOPT_FULL) != 0 &&
+		params->index_cleanup == VACUUM_OPTION_DISABLED)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));
+
 	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
@@ -1719,6 +1732,16 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	onerelid = onerel->rd_lockInfo.lockRelId;
 	LockRelationIdForSession(&onerelid, lmode);
 
+	/* Set index cleanup option based on reloptions */
+	if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+	{
+		if (onerel->rd_options == NULL ||
+			((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+			params->index_cleanup = VACUUM_OPTION_ENABLED;
+		else
+			params->index_cleanup = VACUUM_OPTION_DISABLED;
+	}
+
 	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
@@ -1899,3 +1922,15 @@ vacuum_delay_point(void)
 		CHECK_FOR_INTERRUPTS();
 	}
 }
+
+/*
+ * A wrapper function of defGetBoolean().
+ *
+ * This function returns VACUUM_OPTION_ENABLED and VACUUM_OPTION_DISABLED
+ * instead of true and false.
+ */
+static VacOptTernaryValue
+get_vacopt_ternary_value(DefElem *def)
+{
+	return defGetBoolean(def) ? VACUUM_OPTION_ENABLED : VACUUM_OPTION_DISABLED;
+}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..8e602c7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.index_cleanup = VACUUM_OPTION_DEFAULT;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d6d8fd0..4e6790b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3443,8 +3444,9 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
-		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "INDEX_CLEANUP");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..e888eef 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -149,6 +149,19 @@ typedef enum VacuumOption
 } VacuumOption;
 
 /*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum VacOptTernaryValue
+{
+	VACUUM_OPTION_ENABLED = 0,
+	VACUUM_OPTION_DISABLED,
+	VACUUM_OPTION_DEFAULT,
+} VacOptTernaryValue;
+
+/*
  * Parameters customizing behavior of VACUUM and ANALYZE.
  *
  * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
@@ -167,6 +180,8 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	VacOptTernaryValue index_cleanup;	/* Do index vacuum and cleanup,
+										* default value depends on reloptions */
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 5402851..89a7fbf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..b528413 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,16 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- error
+VACUUM (INDEX_CLEANUP FALSE, FULL TRUE) no_index_cleanup;
+ERROR:  VACUUM option INDEX_CLEANUP cannot be set to false with FULL
+-- check if FULL option works even when index cleanup is disabled by reloption.
+VACUUM (FULL TRUE) no_index_cleanup;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -136,6 +146,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
 CREATE TABLE vacowned_parted (a int) PARTITION BY LIST (a);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..3280eca 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,16 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- error
+VACUUM (INDEX_CLEANUP FALSE, FULL TRUE) no_index_cleanup;
+-- check if FULL option works even when index cleanup is disabled by reloption.
+VACUUM (FULL TRUE) no_index_cleanup;
+
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -107,6 +117,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
-- 
2.10.5

v12-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchapplication/x-patch; name=v12-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchDownload
From 51f5bd1d03dcce2ccaefd18fdaf2f99365c8fd74 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v12 2/2] Add --disable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 41c7f3d..02d6e46 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..2b7cd18 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(INDEX_CLEANUP FALSE\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 5ac41ea..c090ee7 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -42,6 +42,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -117,6 +118,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -244,6 +246,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -288,6 +295,12 @@ main(int argc, char *argv[])
 					progname, "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			fprintf(stderr, _("%s: cannot use the \"%s\" option when performing only analyze\n"),
+					progname, "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -418,6 +431,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL 12\n"),
+				progname, "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -868,6 +889,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1221,6 +1249,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
2.10.5

In reply to: Masahiko Sawada (#54)
Re: New vacuum option to do only freezing

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

I have read this patch. I like the concept and would like it to get committed.

Question I have after reading the patch is around this construct:

 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */

This seems to change behavior on heap tuples, even though the option itself is documented to be about "Indexes" only. This needs either better explanation what "forget dead tuples" means and that it does not lead to some kind of internal inconsistency, or documentation on what is the effect on heap tuples.

This same block raises a question on "after I enable this option, do a vacuum, decide I don't like it, what do I need to do to disable it back?" - just set it back, or set and perform a vacuum, or set and perform a VACUUM FULL as something was "forgotten"?

It may happen this concept of "forgetting" is documented somewhere in the near comments but I'd prefer it to be stated explicitly.

The new status of this patch is: Waiting on Author

#56Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#54)
Re: New vacuum option to do only freezing

Hello.

At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=oV9dtdiRthdV+WjnD4w@mail.gmail.com>

On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it in manner of you suggested after opened the relation.
Is that right?

Blech, no, let's not do that. We'd better use some method that can
indicate yes/no/default. Something like psql's trivalue thingy, but
probably not exactly that. We can define an enum for this purpose,
for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or
maybe there's some other way. But let's not pass bits of the parse
tree around any more than really required.

I've defined an enum VacOptTernaryValue representing
enabled/disabled/default and added index_cleanup variable as the new

It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat
ENABLED=0 and DISABLED=1 are misleading.

enum type to VacuumParams. The vacuum options that uses the reloption
value as the default value such as index cleanup and new truncation
option can use this enum and set either enabled or disabled after
opened the relation when it’s set to default. Please find the attached
patches.

+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);

This is actually a type converter of boolean. It is used to read
VACUUM option but not used to read reloption. It seems useless.

Finally the ternary value is determined to true or false before
use. It is simple that index_cleanup finaly be read as bool. We
could add another boolean to indicate that the value is set or
not, but I think it would be better that the ternary type is a
straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0,
ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a
certain point.

So, how about this?

#define VACOPT_TERNARY_DEFAULT -1
typedef int VacOptTernaryValue; /* -1 is undecided, 0 is false, 1 is true */

/* No longer the value mustn't be left DEFAULT */
Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT);

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

Well, that's a fair point, I guess. If we go that that route, we'll
need to make sure that setting the reloption doesn't prevent VACUUM
FULL from working -- the complaint must only affect an explicit option
on the VACUUM command line. I think we should have a regression test
for that.

I've added regression tests. Since we check it before setting
index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
working even when the vacuum_index_cleanup is false.

+ errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));

I'm not one to talk on this, but this seems somewhat confused.

"VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified"

or

"INDEX_CLEANUP cannot be disabled for VACUUM FULL"?

And in the following part:

+	/* Set index cleanup option based on reloptions */
+	if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+	{
+		if (onerel->rd_options == NULL ||
+			((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+			params->index_cleanup = VACUUM_OPTION_ENABLED;
+		else
+			params->index_cleanup = VACUUM_OPTION_DISABLED;
+	}
+

The option should not be false while VACUUM FULL, and maybe we
should complain in WARNING or NOTICE that the relopt is ignored.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#57Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#56)
Re: New vacuum option to do only freezing

On Wed, Apr 3, 2019 at 10:56 AM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=oV9dtdiRthdV+WjnD4w@mail.gmail.com>

On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it in manner of you suggested after opened the relation.
Is that right?

Blech, no, let's not do that. We'd better use some method that can
indicate yes/no/default. Something like psql's trivalue thingy, but
probably not exactly that. We can define an enum for this purpose,
for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}. Or
maybe there's some other way. But let's not pass bits of the parse
tree around any more than really required.

I've defined an enum VacOptTernaryValue representing
enabled/disabled/default and added index_cleanup variable as the new

Thank you for reviewing the patch!

It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat
ENABLED=0 and DISABLED=1 are misleading.

Indeed, will fix.

enum type to VacuumParams. The vacuum options that uses the reloption
value as the default value such as index cleanup and new truncation
option can use this enum and set either enabled or disabled after
opened the relation when it’s set to default. Please find the attached
patches.

+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);

This is actually a type converter of boolean. It is used to read
VACUUM option but not used to read reloption. It seems useless.

Finally the ternary value is determined to true or false before
use. It is simple that index_cleanup finaly be read as bool. We
could add another boolean to indicate that the value is set or
not, but I think it would be better that the ternary type is a
straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0,
ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a
certain point.

So, how about this?

#define VACOPT_TERNARY_DEFAULT -1
typedef int VacOptTernaryValue; /* -1 is undecided, 0 is false, 1 is true */

Hmm, if we do that we set either VAOPT_TERNARY_DEFAULT, true or false
to index_cleanup, but I'm not sure this is a good approach. I think we
would want VACOPT_TERNARY_TRUE and VACOPT_TERNARY_FALSE as we defined
new type as a ternary value and already have VACOPT_TERNARY_DEFAULT.

/* No longer the value mustn't be left DEFAULT */
Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT);

Agreed, will add it.

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

Well, that's a fair point, I guess. If we go that that route, we'll
need to make sure that setting the reloption doesn't prevent VACUUM
FULL from working -- the complaint must only affect an explicit option
on the VACUUM command line. I think we should have a regression test
for that.

I've added regression tests. Since we check it before setting
index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
working even when the vacuum_index_cleanup is false.

+ errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));

I'm not one to talk on this, but this seems somewhat confused.

"VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified"

or

"INDEX_CLEANUP cannot be disabled for VACUUM FULL"?

I prefer the former, will fix.

And in the following part:

+       /* Set index cleanup option based on reloptions */
+       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+       {
+               if (onerel->rd_options == NULL ||
+                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+                       params->index_cleanup = VACUUM_OPTION_ENABLED;
+               else
+                       params->index_cleanup = VACUUM_OPTION_DISABLED;
+       }
+

The option should not be false while VACUUM FULL,

I think that we need to complain only when INDEX_CLEANUP option is
disabled by an explicit option on the VACUUM command and FULL option
is specified. It's no problem when vacuum_index_cleanup is false and
FULL option is true. Since internally we don't use index cleanup when
vacuum full I guess that we don't need to require index_cleanup being
always true even when full option is specified.

and maybe we
should complain in WARNING or NOTICE that the relopt is ignored.

I think when users want to control index cleanup behavior manually
they specify INDEX_CLEANUP option on the VACUUM command. So it seems
to me that overwriting a reloption by an explicit option would be a
natural behavior. I'm concerned that these message would rather
confuse users.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#58Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Masahiko Sawada (#57)
Re: New vacuum option to do only freezing

At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBkaHka5sav5N6vvoKS9qpmrWRBdyNGP8S7M0SsPd0iyQ@mail.gmail.com>

And in the following part:

+       /* Set index cleanup option based on reloptions */
+       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+       {
+               if (onerel->rd_options == NULL ||
+                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+                       params->index_cleanup = VACUUM_OPTION_ENABLED;
+               else
+                       params->index_cleanup = VACUUM_OPTION_DISABLED;
+       }
+

The option should not be false while VACUUM FULL,

I think that we need to complain only when INDEX_CLEANUP option is
disabled by an explicit option on the VACUUM command and FULL option
is specified. It's no problem when vacuum_index_cleanup is false and
FULL option is true. Since internally we don't use index cleanup when
vacuum full I guess that we don't need to require index_cleanup being
always true even when full option is specified.

I know it's safe. It's just about integrity of option values. So
I don't insist on that.

and maybe we
should complain in WARNING or NOTICE that the relopt is ignored.

I think when users want to control index cleanup behavior manually
they specify INDEX_CLEANUP option on the VACUUM command. So it seems
to me that overwriting a reloption by an explicit option would be a
natural behavior. I'm concerned that these message would rather
confuse users.

If it "cannot be specified with FULL", it seems strange that it's
safe being specified by reloptions.

I'm rather thinking that INDEX_CLEANUP = false is ignorable even
being specified with FULL option, aand DISABLE_PAGE_SKIPPING for
VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages
in the first place.

Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead
of complaining about INDEX_CLEANUP & FULL? If so, I feel just
ignoring the relopt cases is reasonable.

Yeah, perhaps I'm warrying too much.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#59Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#58)
3 attachment(s)
Re: New vacuum option to do only freezing

On Wed, Apr 3, 2019 at 1:17 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBkaHka5sav5N6vvoKS9qpmrWRBdyNGP8S7M0SsPd0iyQ@mail.gmail.com>

And in the following part:

+       /* Set index cleanup option based on reloptions */
+       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+       {
+               if (onerel->rd_options == NULL ||
+                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+                       params->index_cleanup = VACUUM_OPTION_ENABLED;
+               else
+                       params->index_cleanup = VACUUM_OPTION_DISABLED;
+       }
+

The option should not be false while VACUUM FULL,

I think that we need to complain only when INDEX_CLEANUP option is
disabled by an explicit option on the VACUUM command and FULL option
is specified. It's no problem when vacuum_index_cleanup is false and
FULL option is true. Since internally we don't use index cleanup when
vacuum full I guess that we don't need to require index_cleanup being
always true even when full option is specified.

I know it's safe. It's just about integrity of option values. So
I don't insist on that.

and maybe we
should complain in WARNING or NOTICE that the relopt is ignored.

I think when users want to control index cleanup behavior manually
they specify INDEX_CLEANUP option on the VACUUM command. So it seems
to me that overwriting a reloption by an explicit option would be a
natural behavior. I'm concerned that these message would rather
confuse users.

If it "cannot be specified with FULL", it seems strange that it's
safe being specified by reloptions.

I'm rather thinking that INDEX_CLEANUP = false is ignorable even
being specified with FULL option, aand DISABLE_PAGE_SKIPPING for
VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages
in the first place.

Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead
of complaining about INDEX_CLEANUP & FULL? If so, I feel just
ignoring the relopt cases is reasonable.

Agreed with being silent even when INDEX_CLEANUP/vacuum_index_cleanup
= false and FULL = true. For DISABLE_PAGE_SKIPPING, it should be a
separate patch and changes the existing behavior. Maybe need other
discussion.

For VacOptTernaryValue part, I've incorporated the review comments but
left the new enum type since it seems to be more straightforward for
now. I might change that if there are other way.

Attached the updated version patches including the
DISABLE_PAGE_SKIPPING part (0003).

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v13-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchapplication/x-patch; name=v13-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchDownload
From 48ad0ab24084833b888b9a0f5a7ce1a55fb24775 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v13 1/3] Add INDEX_CLEANUP option to VACUUM command

If this option is false, VACUUM does HOT-pruning for live tuples but
doesn't remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 15 +++++++
 doc/src/sgml/ref/vacuum.sgml           | 23 ++++++++++
 src/backend/access/common/reloptions.c | 13 +++++-
 src/backend/access/heap/vacuumlazy.c   | 80 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 28 ++++++++++++
 src/backend/postmaster/autovacuum.c    |  1 +
 src/bin/psql/tab-complete.c            |  6 ++-
 src/include/commands/vacuum.h          | 15 +++++++
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  9 ++++
 src/test/regress/sql/vacuum.sql        | 10 +++++
 11 files changed, 181 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 0fcbc66..910db52 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1378,6 +1378,21 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Enables or disables index cleanup when <command>VACUUM</command> is
+      run on this table.  The default value is <literal>true</literal>.
+      Disabling index cleanup can speed of <command>VACUUM</command> very
+      significantly, but may also lead to severely bloated indexes if table
+      modifications are frequent.  The <literal>INDEX_CLEANUP</literal>
+      parameter to <xref linkend="sql-vacuum"/>, if specified, overrides
+      the value of this option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..643e97b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to remove
+      index entries pointing to dead tuples.  This is normally the desired
+      behavior and is the default unless the
+      <literal>vacuum_index_cleanup</literal> option has been set to false
+      for the table to be vacuumed.  Setting this option to false may be
+      useful when it is necessary to make vacuum run as quickly as possible,
+      for example to avoid imminent transaction ID wraparound
+      (see <xref linkend="vacuum-for-wraparound"/>).  However, if index
+      cleanup is not performed regularly, performance may suffer, because
+      as the table is modified, indexes will accumulate dead line pointers
+      and the table itself will accumulate dead line pointers that cannot be
+      removed until index cleanup is completed.  This option has no effect
+      for tables that do not have an index and cannot be used in conjunction
+      with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b58a1f7..e2c0de3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b5b464e..5c5ddce 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -209,6 +211,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	MultiXactId new_min_multi;
 
 	Assert(params != NULL);
+	Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
 
 	/* measure elapsed time iff autovacuum logging requires it */
 	if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -275,7 +278,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							 params->index_cleanup == VACOPT_TERNARY_ENABLED);
 
 	/* Do the vacuuming */
 	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
@@ -349,7 +353,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -419,6 +423,12 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
+			if (vacrelstats->nleft_dead_tuples > 0 ||
+				vacrelstats->nleft_dead_itemids > 0)
+				appendStringInfo(&buf,
+								 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
+								 vacrelstats->nleft_dead_tuples,
+								 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -501,7 +511,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -534,6 +547,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -1070,7 +1084,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since the dead tuples will be not be vacuumed
+						 * and ignored when index cleanup is disabled we
+						 * count them for reporting.
+						 */
+						nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1222,15 +1245,32 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1238,7 +1278,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1364,7 +1403,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1433,8 +1474,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1465,6 +1509,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2108,7 +2154,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index fd2e47f..1a7291d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -76,6 +76,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
  * Primary entry point for manual VACUUM and ANALYZE commands
@@ -95,6 +96,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool disable_page_skipping = false;
 	ListCell	*lc;
 
+	/* Set default value */
+	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -120,6 +124,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "index_cleanup") == 0)
+			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -1719,6 +1725,16 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	onerelid = onerel->rd_lockInfo.lockRelId;
 	LockRelationIdForSession(&onerelid, lmode);
 
+	/* Set index cleanup option based on reloptions if not yet */
+	if (params->index_cleanup == VACOPT_TERNARY_DEFAULT)
+	{
+		if (onerel->rd_options == NULL ||
+			((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+			params->index_cleanup = VACOPT_TERNARY_ENABLED;
+		else
+			params->index_cleanup = VACOPT_TERNARY_DISABLED;
+	}
+
 	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
@@ -1899,3 +1915,15 @@ vacuum_delay_point(void)
 		CHECK_FOR_INTERRUPTS();
 	}
 }
+
+/*
+ * A wrapper function of defGetBoolean().
+ *
+ * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
+ * instead of true and false.
+ */
+static VacOptTernaryValue
+get_vacopt_ternary_value(DefElem *def)
+{
+	return defGetBoolean(def) ? VACOPT_TERNARY_ENABLED : VACOPT_TERNARY_DISABLED;
+}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..0976029 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d34bf86..22576ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3443,8 +3444,9 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
-		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "INDEX_CLEANUP");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..cbd585a 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -149,6 +149,19 @@ typedef enum VacuumOption
 } VacuumOption;
 
 /*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum VacOptTernaryValue
+{
+	VACOPT_TERNARY_DISABLED = 0,
+	VACOPT_TERNARY_ENABLED,
+	VACOPT_TERNARY_DEFAULT
+} VacOptTernaryValue;
+
+/*
  * Parameters customizing behavior of VACUUM and ANALYZE.
  *
  * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
@@ -167,6 +180,8 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	VacOptTernaryValue index_cleanup;	/* Do index vacuum and cleanup,
+										* default value depends on reloptions */
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 5402851..89a7fbf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..6ba7cd7 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,14 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- index cleanup option is ignored if VACUUM FULL
+VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
+VACUUM (FULL TRUE) no_index_cleanup;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -136,6 +144,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
 CREATE TABLE vacowned_parted (a int) PARTITION BY LIST (a);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..57e0f35 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,15 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- index cleanup option is ignored if VACUUM FULL
+VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
+VACUUM (FULL TRUE) no_index_cleanup;
+
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -107,6 +116,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
-- 
2.10.5

v13-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchapplication/x-patch; name=v13-0002-Add-disable-index-cleanup-option-to-vacuumdb.patchDownload
From f4f04f94a83a880c36c1371eea6cf4776fa2805f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 1 Feb 2019 16:02:50 +0100
Subject: [PATCH v13 2/3] Add --disable-index-cleanup option to vacuumdb.

---
 doc/src/sgml/ref/vacuumdb.sgml    | 15 +++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  9 ++++++++-
 src/bin/scripts/vacuumdb.c        | 29 +++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 47d9345..8a824ec 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -118,6 +118,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--disable-index-cleanup</option></term>
+      <listitem>
+       <para>
+        Disable index vacuuming and index cleanup.
+       </para>
+       <note>
+        <para>
+         This option is only available for servers running
+         <productname>PostgreSQL</productname> 12 and later.
+        </para>
+       </note>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-e</option></term>
       <term><option>--echo</option></term>
       <listitem>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 7f3a9b1..2b7cd18 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 44;
+use Test::More tests => 47;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -38,6 +38,10 @@ $node->issues_sql_like(
 	qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
 	'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
+	[ 'vacuumdb', '--disable-index-cleanup', 'postgres' ],
+	qr/statement: VACUUM \(INDEX_CLEANUP FALSE\).*;/,
+	'vacuumdb --disable-index-cleanup');
+$node->issues_sql_like(
 	[ 'vacuumdb', '--skip-locked', 'postgres' ],
 	qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
 	'vacuumdb --skip-locked');
@@ -48,6 +52,9 @@ $node->issues_sql_like(
 $node->command_fails(
 	[ 'vacuumdb', '--analyze-only', '--disable-page-skipping', 'postgres' ],
 	'--analyze-only and --disable-page-skipping specified together');
+$node->command_fails(
+	[ 'vacuumdb', '--analyze-only', '--disable-index-cleanup', 'postgres' ],
+	'--analyze-only and --disable-index-cleanup specified together');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
 	'vacuumdb with connection string');
 
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 25ff19e..8711f17 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -43,6 +43,7 @@ typedef struct vacuumingOptions
 	bool		full;
 	bool		freeze;
 	bool		disable_page_skipping;
+	bool		disable_index_cleanup;
 	bool		skip_locked;
 	int			min_xid_age;
 	int			min_mxid_age;
@@ -118,6 +119,7 @@ main(int argc, char *argv[])
 		{"skip-locked", no_argument, NULL, 5},
 		{"min-xid-age", required_argument, NULL, 6},
 		{"min-mxid-age", required_argument, NULL, 7},
+		{"disable-index-cleanup", no_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -242,6 +244,11 @@ main(int argc, char *argv[])
 					exit(1);
 				}
 				break;
+			case 8:
+				{
+					vacopts.disable_index_cleanup = true;
+					break;
+				}
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -286,6 +293,12 @@ main(int argc, char *argv[])
 						 "disable-page-skipping");
 			exit(1);
 		}
+		if (vacopts.disable_index_cleanup)
+		{
+			pg_log_error("cannot use the \"%s\" option when performing only analyze",
+						 "disable-index-cleanup");
+			exit(1);
+		}
 		/* allow 'and_analyze' with 'analyze_only' */
 	}
 
@@ -414,6 +427,14 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		exit(1);
 	}
 
+	if (vacopts->disable_index_cleanup && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL 12",
+					 "disable-index-cleanup");
+		exit(1);
+	}
+
 	if (vacopts->skip_locked && PQserverVersion(conn) < 120000)
 	{
 		PQfinish(conn);
@@ -864,6 +885,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 				appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
 				sep = comma;
 			}
+			if (vacopts->disable_index_cleanup)
+			{
+				/* INDEX_CLEANUP is supported since 12 */
+				Assert(serverVersion >= 120000);
+				appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
+				sep = comma;
+			}
 			if (vacopts->skip_locked)
 			{
 				/* SKIP_LOCKED is supported since v12 */
@@ -1216,6 +1244,7 @@ help(const char *progname)
 	printf(_("  -a, --all                       vacuum all databases\n"));
 	printf(_("  -d, --dbname=DBNAME             database to vacuum\n"));
 	printf(_("      --disable-page-skipping     disable all page-skipping behavior\n"));
+	printf(_("      --disable-index-cleanup     disable index vacuuming and index cleanup\n"));
 	printf(_("  -e, --echo                      show the commands being sent to the server\n"));
 	printf(_("  -f, --full                      do full vacuuming\n"));
 	printf(_("  -F, --freeze                    freeze row transaction information\n"));
-- 
2.10.5

v13-0003-Do-not-complain-even-when-DISABLE_PAGE_SKIPPING-.patchapplication/x-patch; name=v13-0003-Do-not-complain-even-when-DISABLE_PAGE_SKIPPING-.patchDownload
From ae509e0477130b2ccd9fcf1bcc66b91a463ee995 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 3 Apr 2019 14:23:30 +0900
Subject: [PATCH v13 3/3] Do not complain even when DISABLE_PAGE_SKIPPING and
 FULL are true

---
 src/backend/commands/vacuum.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 1a7291d..34c761e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -257,15 +257,6 @@ vacuum(List *relations, VacuumParams *params,
 						stmttype)));
 
 	/*
-	 * Sanity check DISABLE_PAGE_SKIPPING option.
-	 */
-	if ((params->options & VACOPT_FULL) != 0 &&
-		(params->options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
-
-	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
-- 
2.10.5

#60Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#59)
Re: New vacuum option to do only freezing

On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patches including the
DISABLE_PAGE_SKIPPING part (0003).

I am confused about nleft_dead_tuples. It looks like it gets
incremented whenever we set tupgone = true, regardless of whether we
are doing index cleanup. But if we ARE doing index cleanup then the
dead tuple will not be left. And if we are not doing index vacuum
then we still don't need this for anything, because tups_vacuumed is
counting the same thing. I may be confused. But if I'm not, then I
think this should just be ripped out, and we should only keep
nleft_dead_itemids.

As far as VacOptTernaryValue, I think it would be safer to change this
so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the
value that people are likely to want by default, which makes it less
likely that people will accidentally write future code that doesn't
clean up indexes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#61Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#60)
Re: New vacuum option to do only freezing

Hello.

At Wed, 3 Apr 2019 11:55:00 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+Tgmoas581jpJ0TPaA38OhjXHgbLy8z1fuuHH7CaNkrboZJeA@mail.gmail.com>

On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patches including the
DISABLE_PAGE_SKIPPING part (0003).

I am confused about nleft_dead_tuples. It looks like it gets
incremented whenever we set tupgone = true, regardless of whether we
are doing index cleanup. But if we ARE doing index cleanup then the
dead tuple will not be left. And if we are not doing index vacuum
then we still don't need this for anything, because tups_vacuumed is
counting the same thing. I may be confused. But if I'm not, then I
think this should just be ripped out, and we should only keep
nleft_dead_itemids.

tups_vacuumed is including heap_page_prune()ed tuples, which
aren't counted as "tupgone".

As far as VacOptTernaryValue, I think it would be safer to change this
so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the
value that people are likely to want by default, which makes it less
likely that people will accidentally write future code that doesn't
clean up indexes.

It's convincing. My compalint was enabled=0 and disabled=1 is
confusing so I'm fine with default=0, disabled=1, enabled=2.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#62Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Kyotaro HORIGUCHI (#61)
1 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Apr 4, 2019 at 9:18 AM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

At Wed, 3 Apr 2019 11:55:00 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+Tgmoas581jpJ0TPaA38OhjXHgbLy8z1fuuHH7CaNkrboZJeA@mail.gmail.com>

On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patches including the
DISABLE_PAGE_SKIPPING part (0003).

I am confused about nleft_dead_tuples. It looks like it gets
incremented whenever we set tupgone = true, regardless of whether we
are doing index cleanup. But if we ARE doing index cleanup then the
dead tuple will not be left. And if we are not doing index vacuum
then we still don't need this for anything, because tups_vacuumed is
counting the same thing. I may be confused. But if I'm not, then I
think this should just be ripped out, and we should only keep
nleft_dead_itemids.

tups_vacuumed is including heap_page_prune()ed tuples, which
aren't counted as "tupgone".

Yes. tup_vacuumed counts not only HOT pruned tuples but also tuples
that became dead after heap_page_prune(). When index clenaup is
disabled, the former leaves only itemid whereas the latter leaves
itemid and heap tuple as we don't remove. nleft_dead_tuples counts
only the latter to report precisely. I think nleft_dead_tuples should
be incremented only when index cleanup is disabled, and the that part
comment should be polished.

As far as VacOptTernaryValue, I think it would be safer to change this
so that VACOPT_TERNARY_DEFAULT = 0. That way palloc0 will fill in the
value that people are likely to want by default, which makes it less
likely that people will accidentally write future code that doesn't
clean up indexes.

It's convincing. My compalint was enabled=0 and disabled=1 is
confusing so I'm fine with default=0, disabled=1, enabled=2.

Okay, fixed.

Attached the updated version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

v14-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchapplication/octet-stream; name=v14-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patchDownload
From f08ed881b2771b58d788cae4d48368ce4dd0c5a2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 7 Mar 2019 09:45:11 +0900
Subject: [PATCH v14 1/3] Add INDEX_CLEANUP option to VACUUM command

If this option is false, VACUUM does HOT-pruning for live tuples but
doesn't remove dead tuples completely and disables index vacuum.

vacrelstats->dead_tuples could have tuples that became dead after
checked at a HOT-pruning time, which are not marked as dead. Per
discussion on pgsql-hackers We normally records and remove them but
with this option we don't process and leave for the next vacuum for
simplifing the code. That's okay because it's very rare condition and
those tuples will be processed by the next vacuum.
---
 doc/src/sgml/ref/create_table.sgml     | 15 ++++++
 doc/src/sgml/ref/vacuum.sgml           | 23 ++++++++
 src/backend/access/common/reloptions.c | 13 ++++-
 src/backend/access/heap/vacuumlazy.c   | 97 ++++++++++++++++++++++++++--------
 src/backend/commands/vacuum.c          | 28 ++++++++++
 src/backend/postmaster/autovacuum.c    |  1 +
 src/bin/psql/tab-complete.c            |  6 ++-
 src/include/commands/vacuum.h          | 15 ++++++
 src/include/utils/rel.h                |  1 +
 src/test/regress/expected/vacuum.out   |  9 ++++
 src/test/regress/sql/vacuum.sql        | 10 ++++
 11 files changed, 193 insertions(+), 25 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 99b95bb..dea2881 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1373,6 +1373,21 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <listitem>
+     <para>
+      Enables or disables index cleanup when <command>VACUUM</command> is
+      run on this table.  The default value is <literal>true</literal>.
+      Disabling index cleanup can speed of <command>VACUUM</command> very
+      significantly, but may also lead to severely bloated indexes if table
+      modifications are frequent.  The <literal>INDEX_CLEANUP</literal>
+      parameter to <xref linkend="sql-vacuum"/>, if specified, overrides
+      the value of this option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 906d0c2..643e97b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,6 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -182,6 +183,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </varlistentry>
 
    <varlistentry>
+    <term><literal>INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to remove
+      index entries pointing to dead tuples.  This is normally the desired
+      behavior and is the default unless the
+      <literal>vacuum_index_cleanup</literal> option has been set to false
+      for the table to be vacuumed.  Setting this option to false may be
+      useful when it is necessary to make vacuum run as quickly as possible,
+      for example to avoid imminent transaction ID wraparound
+      (see <xref linkend="vacuum-for-wraparound"/>).  However, if index
+      cleanup is not performed regularly, performance may suffer, because
+      as the table is modified, indexes will accumulate dead line pointers
+      and the table itself will accumulate dead line pointers that cannot be
+      removed until index cleanup is completed.  This option has no effect
+      for tables that do not have an index and cannot be used in conjunction
+      with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b58a1f7..e2c0de3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -138,6 +138,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		false
 	},
+	{
+		{
+			"vacuum_index_cleanup",
+			"Enables index vacuuming and index cleanup",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		true
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1388,7 +1397,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"parallel_workers", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, parallel_workers)},
 		{"vacuum_cleanup_index_scale_factor", RELOPT_TYPE_REAL,
-		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)}
+		offsetof(StdRdOptions, vacuum_cleanup_index_scale_factor)},
+		{"vacuum_index_cleanup", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, vacuum_index_cleanup)}
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, &numoptions);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 392b35e..c9d8312 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -112,8 +112,8 @@
 
 typedef struct LVRelStats
 {
-	/* hasindex = true means two-pass strategy; false means one-pass */
-	bool		hasindex;
+	/* useindex = true means two-pass strategy; false means one-pass */
+	bool		useindex;
 	/* Overall statistics about rel */
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
@@ -125,6 +125,8 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
+	double		nleft_dead_tuples;	/* # of dead tuples we left */
+	double		nleft_dead_itemids;	/* # of dead item pointers we left */
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -150,7 +152,7 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, int options,
+static void lazy_scan_heap(Relation onerel, VacuumParams *params,
 			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
 			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats, BlockNumber nblocks);
@@ -209,6 +211,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	MultiXactId new_min_multi;
 
 	Assert(params != NULL);
+	Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
 
 	/* measure elapsed time iff autovacuum logging requires it */
 	if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -275,10 +278,11 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
 	/* Open all indexes of the relation */
 	vac_open_indexes(onerel, RowExclusiveLock, &nindexes, &Irel);
-	vacrelstats->hasindex = (nindexes > 0);
+	vacrelstats->useindex = (nindexes > 0 &&
+							 params->index_cleanup == VACOPT_TERNARY_ENABLED);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, params->options, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, params, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -349,7 +353,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						new_rel_pages,
 						new_live_tuples,
 						new_rel_allvisible,
-						vacrelstats->hasindex,
+						nindexes > 0,
 						new_frozen_xid,
 						new_min_multi,
 						false);
@@ -419,6 +423,12 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
+			if (vacrelstats->nleft_dead_tuples > 0 ||
+				vacrelstats->nleft_dead_itemids > 0)
+				appendStringInfo(&buf,
+								 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
+								 vacrelstats->nleft_dead_tuples,
+								 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -485,7 +495,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
 	BlockNumber nblocks,
@@ -501,7 +511,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 				live_tuples,	/* live tuples (reltuples estimate) */
 				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused;		/* unused item pointers */
+				nunused,		/* unused item pointers */
+				nleft_dead_tuples,		/* tuples we left as dead */
+				nleft_dead_itemids;		/* item pointers we left as dead,
+										 * includes nleft_dead_tuples. */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -534,6 +547,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
 	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+	nleft_dead_itemids = nleft_dead_tuples = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -599,7 +613,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
 	next_unskippable_block = 0;
-	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+	if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
 		while (next_unskippable_block < nblocks)
 		{
@@ -654,7 +668,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		{
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
-			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
+			if ((params->options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
 				while (next_unskippable_block < nblocks)
 				{
@@ -1070,7 +1084,17 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 						HeapTupleIsHeapOnly(&tuple))
 						nkeep += 1;
 					else
+					{
 						tupgone = true; /* we can delete the tuple */
+
+						/*
+						 * Since this dead tuple will not be vacuumed and
+						 * ignored when index cleanup is disabled we count
+						 * count it for reporting.
+						 */
+						if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+							nleft_dead_tuples++;
+					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1222,15 +1246,33 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		}
 
 		/*
-		 * If there are no indexes then we can vacuum the page right now
-		 * instead of doing a second scan.
+		 * If there are no indexes we can vacuum the page right now instead of
+		 * doing a second scan. Also we don't do that but forget dead tuples
+		 * when index cleanup is disabled.
 		 */
-		if (nindexes == 0 &&
-			vacrelstats->num_dead_tuples > 0)
+		if (!vacrelstats->useindex && vacrelstats->num_dead_tuples > 0)
 		{
-			/* Remove tuples from heap */
-			lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
-			has_dead_tuples = false;
+			if (nindexes == 0)
+			{
+				/* Remove tuples from heap if the table has no index */
+				lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+				vacuumed_pages++;
+				has_dead_tuples = false;
+			}
+			else
+			{
+				/*
+				 * Here, we have indexes but index cleanup is disabled. Instead of
+				 * vacuuming the dead tuples on the heap, we just forget them.
+				 *
+				 * Note that vacrelstats->dead_tuples could have tuples which
+				 * became dead after HOT-pruning but are not marked dead yet.
+				 * We do not process them because it's a very rare condition, and
+				 * the next vacuum will process them anyway.
+				 */
+				Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
+				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+			}
 
 			/*
 			 * Forget the now-vacuumed tuples, and press on, but be careful
@@ -1238,7 +1280,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			 * valid.
 			 */
 			vacrelstats->num_dead_tuples = 0;
-			vacuumed_pages++;
 
 			/*
 			 * Periodically do incremental FSM vacuuming to make newly-freed
@@ -1357,6 +1398,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
+	/* No dead tuples should be left if index cleanup is enabled */
+	Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
+			nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+		   params->index_cleanup == VACOPT_TERNARY_DISABLED);
+
 	/* report that everything is scanned and vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
@@ -1364,7 +1410,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	/* save stats for use later */
 	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep;
+	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
+	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
+	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1433,8 +1481,11 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
-	for (i = 0; i < nindexes; i++)
-		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	if (vacrelstats->useindex)
+	{
+		for (i = 0; i < nindexes; i++)
+			lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+	}
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
@@ -1465,6 +1516,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
+	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
+					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
@@ -2110,7 +2163,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	autovacuum_work_mem != -1 ?
 	autovacuum_work_mem : maintenance_work_mem;
 
-	if (vacrelstats->hasindex)
+	if (vacrelstats->useindex)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index fd2e47f..1a7291d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -76,6 +76,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 				  TransactionId lastSaneFrozenXid,
 				  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
 
 /*
  * Primary entry point for manual VACUUM and ANALYZE commands
@@ -95,6 +96,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool disable_page_skipping = false;
 	ListCell	*lc;
 
+	/* Set default value */
+	params.index_cleanup = VACOPT_TERNARY_DEFAULT;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -120,6 +124,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			full = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "disable_page_skipping") == 0)
 			disable_page_skipping = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "index_cleanup") == 0)
+			params.index_cleanup = get_vacopt_ternary_value(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -1719,6 +1725,16 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
 	onerelid = onerel->rd_lockInfo.lockRelId;
 	LockRelationIdForSession(&onerelid, lmode);
 
+	/* Set index cleanup option based on reloptions if not yet */
+	if (params->index_cleanup == VACOPT_TERNARY_DEFAULT)
+	{
+		if (onerel->rd_options == NULL ||
+			((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+			params->index_cleanup = VACOPT_TERNARY_ENABLED;
+		else
+			params->index_cleanup = VACOPT_TERNARY_DISABLED;
+	}
+
 	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
@@ -1899,3 +1915,15 @@ vacuum_delay_point(void)
 		CHECK_FOR_INTERRUPTS();
 	}
 }
+
+/*
+ * A wrapper function of defGetBoolean().
+ *
+ * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
+ * instead of true and false.
+ */
+static VacOptTernaryValue
+get_vacopt_ternary_value(DefElem *def)
+{
+	return defGetBoolean(def) ? VACOPT_TERNARY_ENABLED : VACOPT_TERNARY_DISABLED;
+}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db..0976029 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2886,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			(dovacuum ? VACOPT_VACUUM : 0) |
 			(doanalyze ? VACOPT_ANALYZE : 0) |
 			(!wraparound ? VACOPT_SKIP_LOCKED : 0);
+		tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d34bf86..22576ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1039,6 +1039,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.log_autovacuum_min_duration",
 	"toast_tuple_target",
 	"user_catalog_table",
+	"vacuum_index_cleanup",
 	NULL
 };
 
@@ -3443,8 +3444,9 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
-						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED");
-		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED"))
+						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
+						  "INDEX_CLEANUP");
+		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP"))
 			COMPLETE_WITH("ON", "OFF");
 	}
 	else if (HeadMatches("VACUUM") && TailMatches("("))
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 77086f3..9cc6e0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -149,6 +149,19 @@ typedef enum VacuumOption
 } VacuumOption;
 
 /*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum VacOptTernaryValue
+{
+	VACOPT_TERNARY_DEFAULT = 0,
+	VACOPT_TERNARY_DISABLED,
+	VACOPT_TERNARY_ENABLED,
+} VacOptTernaryValue;
+
+/*
  * Parameters customizing behavior of VACUUM and ANALYZE.
  *
  * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
@@ -167,6 +180,8 @@ typedef struct VacuumParams
 	int			log_min_duration;	/* minimum execution threshold in ms at
 									 * which  verbose logs are activated, -1
 									 * to use default */
+	VacOptTernaryValue index_cleanup;	/* Do index vacuum and cleanup,
+										* default value depends on reloptions */
 } VacuumParams;
 
 /* GUC parameters */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 5402851..89a7fbf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -266,6 +266,7 @@ typedef struct StdRdOptions
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
+	bool		vacuum_index_cleanup;	/* enables index vacuuming and cleanup */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 07d0703..6ba7cd7 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -80,6 +80,14 @@ CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- index cleanup option is ignored if VACUUM FULL
+VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
+VACUUM (FULL TRUE) no_index_cleanup;
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -136,6 +144,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
 CREATE TABLE vacowned_parted (a int) PARTITION BY LIST (a);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 81f3822..57e0f35 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -62,6 +62,15 @@ VACUUM FULL vactst;
 
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
+-- INDEX_CLEANUP option
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+-- index cleanup option is ignored if VACUUM FULL
+VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
+VACUUM (FULL TRUE) no_index_cleanup;
+
 -- partitioned table
 CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
 CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
@@ -107,6 +116,7 @@ ANALYZE (SKIP_LOCKED) vactst;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
+DROP TABLE no_index_cleanup;
 
 -- relation ownership, WARNING logs generated as all are skipped.
 CREATE TABLE vacowned (a int);
-- 
2.10.5

#63Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#62)
Re: New vacuum option to do only freezing

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#64Bossart, Nathan
bossartn@amazon.com
In reply to: Robert Haas (#63)
Re: New vacuum option to do only freezing

On 4/4/19, 12:06 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:

Committed with a little bit of documentation tweaking.

Thanks! I noticed a very small typo in the new documentation.

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index fdd8151220..c652f8b0bc 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -199,7 +199,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
       and the table itself will accumulate dead line pointers that cannot be
       removed until index cleanup is completed.  This option has no effect
       for tables that do not have an index and is ignored if the
-      <literal>FULL</literal> is used.
+      <literal>FULL</literal> option is used.
      </para>
     </listitem>
    </varlistentry>

Nathan

#65Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#63)
Re: New vacuum option to do only freezing

On Fri, Apr 5, 2019 at 4:06 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

Thank you for committing them!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#66Robert Haas
robertmhaas@gmail.com
In reply to: Bossart, Nathan (#64)
Re: New vacuum option to do only freezing

On Thu, Apr 4, 2019 at 5:37 PM Bossart, Nathan <bossartn@amazon.com> wrote:

Thanks! I noticed a very small typo in the new documentation.

Committed, thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#67Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#65)
Re: New vacuum option to do only freezing

On Fri, Apr 5, 2019 at 10:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Apr 5, 2019 at 4:06 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

Thank you for committing them!

BTW should we support this option for toast tables as well?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#68Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#63)
Re: New vacuum option to do only freezing

Hi,

On 2019-04-04 15:05:55 -0400, Robert Haas wrote:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

I've closed the commitfest entry. I hope that's accurate?

Greetings,

Andres Freund

#69Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#68)
1 attachment(s)
Re: New vacuum option to do only freezing

On Sat, Apr 6, 2019 at 10:23 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-04-04 15:05:55 -0400, Robert Haas wrote:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

I've closed the commitfest entry. I hope that's accurate?

Yes, but Fujii-san pointed out that this option doesn't support toast
tables and I think there is not specific reason why not supporting
them. So it might be good to add toast.vacuum_index_cleanup. Attached
patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

toast_vacuum_index_cleanup.patchtext/x-patch; charset=US-ASCII; name=toast_vacuum_index_cleanup.patchDownload
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 1a8184e..2421590 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1390,7 +1390,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)</term>
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup (<type>boolean</type>)</term>
     <listitem>
      <para>
       Enables or disables index cleanup when <command>VACUUM</command> is
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index e2c0de3..f0cc9b8 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -142,7 +142,7 @@ static relopt_bool boolRelOpts[] =
 		{
 			"vacuum_index_cleanup",
 			"Enables index vacuuming and index cleanup",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
 			ShareUpdateExclusiveLock
 		},
 		true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7c4e5fba..bb29a62 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1037,6 +1037,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.autovacuum_vacuum_scale_factor",
 	"toast.autovacuum_vacuum_threshold",
 	"toast.log_autovacuum_min_duration",
+	"toast.vacuum_index_cleanup",
 	"toast_tuple_target",
 	"user_catalog_table",
 	"vacuum_index_cleanup",
#70Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#69)
Re: New vacuum option to do only freezing

On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote:

Yes, but Fujii-san pointed out that this option doesn't support toast
tables and I think there is not specific reason why not supporting
them. So it might be good to add toast.vacuum_index_cleanup. Attached
patch.

Being able to control that option at toast level sounds sensible. I
have added an open item about that.
--
Michael

#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#63)
Re: New vacuum option to do only freezing

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

topminnow just failed an assertion from this patch:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&amp;dt=2019-04-14%2011%3A01%3A48

The symptoms are:

TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c", Line: 1404)
...
2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted
2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend

Just looking at the logic around index_cleanup, I rather think that
that assertion is flat out wrong:

+    /* No dead tuples should be left if index cleanup is enabled */
+    Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
+            nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+           params->index_cleanup == VACOPT_TERNARY_DISABLED);

Either it's wrong, or this is:

+                        /*
+                         * Since this dead tuple will not be vacuumed and
+                         * ignored when index cleanup is disabled we count
+                         * count it for reporting.
+                         */
+                        if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+                            nleft_dead_tuples++;

The poor quality of that comment suggests that maybe the code is just
as confused.

(I also think that that "ternary option" stuff is unreadably overdesigned
notation, which possibly contributed to getting this wrong. If even the
author can't keep it straight, it's unreadable.)

regards, tom lane

#72Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#71)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 12:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

topminnow just failed an assertion from this patch:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&amp;dt=2019-04-14%2011%3A01%3A48

The symptoms are:

TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c", Line: 1404)
...
2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted
2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend

Just looking at the logic around index_cleanup, I rather think that
that assertion is flat out wrong:

+    /* No dead tuples should be left if index cleanup is enabled */
+    Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
+            nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+           params->index_cleanup == VACOPT_TERNARY_DISABLED);

Either it's wrong, or this is:

+                        /*
+                         * Since this dead tuple will not be vacuumed and
+                         * ignored when index cleanup is disabled we count
+                         * count it for reporting.
+                         */
+                        if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+                            nleft_dead_tuples++;

Ugh, I think the assertion is right but the above condition is
completely wrong. We should increment nleft_dead_tuples when index
cleanup is *not* enabled. For nleft_dead_itemids we require that index
cleanup is disabled as follows.

{
/*
* Here, we have indexes but index cleanup is disabled.
Instead of
* vacuuming the dead tuples on the heap, we just forget them.
*
* Note that vacrelstats->dead_tuples could have tuples which
* became dead after HOT-pruning but are not marked dead yet.
* We do not process them because it's a very rare condition, and
* the next vacuum will process them anyway.
*/
Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
nleft_dead_itemids += vacrelstats->num_dead_tuples;
}

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#73Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#72)
1 attachment(s)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 9:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Apr 15, 2019 at 12:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 3, 2019 at 10:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Attached the updated version patch.

Committed with a little bit of documentation tweaking.

topminnow just failed an assertion from this patch:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&amp;dt=2019-04-14%2011%3A01%3A48

The symptoms are:

TRAP: FailedAssertion("!((params->index_cleanup == VACOPT_TERNARY_ENABLED && nleft_dead_tuples == 0 && nleft_dead_itemids == 0) || params->index_cleanup == VACOPT_TERNARY_DISABLED)", File: "/home/nm/farm/mipsel_deb8_gcc_32/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c", Line: 1404)
...
2019-04-14 14:49:16.328 CEST [15282:5] LOG: server process (PID 18985) was terminated by signal 6: Aborted
2019-04-14 14:49:16.328 CEST [15282:6] DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_depend

Just looking at the logic around index_cleanup, I rather think that
that assertion is flat out wrong:

+    /* No dead tuples should be left if index cleanup is enabled */
+    Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
+            nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+           params->index_cleanup == VACOPT_TERNARY_DISABLED);

Either it's wrong, or this is:

+                        /*
+                         * Since this dead tuple will not be vacuumed and
+                         * ignored when index cleanup is disabled we count
+                         * count it for reporting.
+                         */
+                        if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+                            nleft_dead_tuples++;

Ugh, I think the assertion is right but the above condition is
completely wrong. We should increment nleft_dead_tuples when index
cleanup is *not* enabled.

Here is a draft patch to fix this issue.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

fix_index_cleanup.patchapplication/octet-stream; name=fix_index_cleanup.patchDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8dc76fa..6efe3e9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1088,11 +1088,17 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 						tupgone = true; /* we can delete the tuple */
 
 						/*
-						 * Since this dead tuple will not be vacuumed and
-						 * ignored when index cleanup is disabled we count
-						 * count it for reporting.
+						 * But, we will leave this dead tuple when index cleanup
+						 * is disabled in order to keep the code for leaving
+						 * dead tuples simple. We can remove its tuple storage
+						 * when vacuuming the page but it requires to vacuum only
+						 * this dead tuple while checking the recorded dead
+						 * tuples, so it is more reasonable way to just skip
+						 * vacuuming the page and leave the whole tuple. It is
+						 * not harmful because this is a rare condition, and the
+						 * next vacuum will process it anyway.
 						 */
-						if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+						if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
 							nleft_dead_tuples++;
 					}
 					all_visible = false;
#74Tom Lane
tgl@sss.pgh.pa.us
In reply to: Masahiko Sawada (#73)
Re: New vacuum option to do only freezing

Masahiko Sawada <sawada.mshk@gmail.com> writes:

Ugh, I think the assertion is right but the above condition is
completely wrong. We should increment nleft_dead_tuples when index
cleanup is *not* enabled.

Here is a draft patch to fix this issue.

So the real issue here, I fear, is that we've got no consistent testing
of the whole case block for HEAPTUPLE_DEAD, as you can easily confirm
by checking the code coverage report at
https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html

This is perhaps unsurprising, given the code comment that points out that
we can only reach that block if the tuple's state changed since
heap_page_prune() a few lines above. Still, it means that this patch
hasn't been tested in that scenario, until we were lucky enough for
a slow buildfarm machine like topminnow to hit it.

What's more, because that block is the only way for "tupgone" to be
set, we also don't reach the "if (tupgone)" block at lines 1183ff.
And I think this patch has probably broken that, too. Surely, if we
are not going to remove the tuple, we should not increment tups_vacuumed?
And maybe we have to freeze it instead? How is it that we can, or should,
treat this situation as different from a dead-but-not-removable tuple?

I have a very strong feeling that this patch was not fully baked.

BTW, I can reproduce the crash fairly easily (within a few cycles
of the regression tests) by (a) inserting pg_usleep(10000) after
the heap_page_prune call, and (b) making autovacuum much more
aggressive.

regards, tom lane

#75Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#74)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Masahiko Sawada <sawada.mshk@gmail.com> writes:

Ugh, I think the assertion is right but the above condition is
completely wrong. We should increment nleft_dead_tuples when index
cleanup is *not* enabled.

Here is a draft patch to fix this issue.

So the real issue here, I fear, is that we've got no consistent testing
of the whole case block for HEAPTUPLE_DEAD, as you can easily confirm
by checking the code coverage report at
https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html

This is perhaps unsurprising, given the code comment that points out that
we can only reach that block if the tuple's state changed since
heap_page_prune() a few lines above. Still, it means that this patch
hasn't been tested in that scenario, until we were lucky enough for
a slow buildfarm machine like topminnow to hit it.

What's more, because that block is the only way for "tupgone" to be
set, we also don't reach the "if (tupgone)" block at lines 1183ff.
And I think this patch has probably broken that, too. Surely, if we
are not going to remove the tuple, we should not increment tups_vacuumed?
And maybe we have to freeze it instead? How is it that we can, or should,
treat this situation as different from a dead-but-not-removable tuple?

I have a very strong feeling that this patch was not fully baked.

I think you're right, but I don't understand the comment in the
preceding paragraph. How does this problem prevent tupgone from
getting set?

It looks to me like nleft_dead_tuples should be ripped out. It's
basically trying to count the same thing as tups_vacuumed, and there
doesn't seem to be any need to count that thing twice. And then just
below this block:

/* save stats for use later */
vacrelstats->tuples_deleted = tups_vacuumed;
vacrelstats->new_dead_tuples = nkeep;
vacrelstats->nleft_dead_itemids = nleft_dead_itemids;

We should do something like:

if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
{
nkeep += tups_vacuumed;
tups_vacuumed = 0;
}

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#75)
Re: New vacuum option to do only freezing

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I have a very strong feeling that this patch was not fully baked.

I think you're right, but I don't understand the comment in the
preceding paragraph. How does this problem prevent tupgone from
getting set?

My point is that I suspect that tupgone *shouldn't* get set.
It's not (going to be) gone.

It looks to me like nleft_dead_tuples should be ripped out.

That was pretty much what I was thinking too. It makes more sense
just to treat this case identically to dead-but-not-yet-removable.
I have substantial doubts about nleft_dead_itemids being worth
anything, as well.

We should do something like:
if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
{
nkeep += tups_vacuumed;
tups_vacuumed = 0;
}

No. I'm thinking there should be exactly one test of index_cleanup
in this logic, and what it would be is along the lines of

if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple) ||
+ params->index_cleanup == VACOPT_TERNARY_DISABLED)
nkeep += 1;
else

In general, this thing has a strong whiff of "large patch
with a small patch struggling to get out".

regards, tom lane

#77Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#76)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

No. I'm thinking there should be exactly one test of index_cleanup
in this logic, and what it would be is along the lines of

if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple) ||
+ params->index_cleanup == VACOPT_TERNARY_DISABLED)
nkeep += 1;
else

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

My idea of the mechanism of action of this patch is that we accumulate
the tuples just as if we were going to vacuum them, but then at the
end of each page we forget them all, sorta like there are no indexes.
In that mental model, I don't really see why this part of this logic
needs any adjustment at all vs. pre-patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#77)
Re: New vacuum option to do only freezing

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 15, 2019 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

No. I'm thinking there should be exactly one test of index_cleanup
in this logic, and what it would be is along the lines of ...

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

Since we know that the tuple only just became dead, I suspect
that the case would be unreachable in practice. But the approach
you propose risks violating the invariant that all old tuples
will either be removed or frozen.

regards, tom lane

#79Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#78)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 09:07:16PM -0400, Tom Lane wrote:

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

Since we know that the tuple only just became dead, I suspect
that the case would be unreachable in practice. But the approach
you propose risks violating the invariant that all old tuples
will either be removed or frozen.

Please note that I have added an open item for this investigation (see
"topminnow triggered assertion failure with vacuum_index_cleanup").
--
Michael

#80Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#76)
Re: New vacuum option to do only freezing

On Tue, Apr 16, 2019 at 4:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 15, 2019 at 1:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I have a very strong feeling that this patch was not fully baked.

I think you're right, but I don't understand the comment in the
preceding paragraph. How does this problem prevent tupgone from
getting set?

My point is that I suspect that tupgone *shouldn't* get set.
It's not (going to be) gone.

It looks to me like nleft_dead_tuples should be ripped out.

That was pretty much what I was thinking too.

tups_vacuumed counts not only (1)dead-but-not-yet-removable tuple but
also HOT-pruned tuples. These HOT-pruned tuples include both (2)the
tuples we removed both its itemid and tuple storage and the tuples
(3)we removed only its tuple storage and marked itemid as dead. So we
cannot add tups_vacuumed to nkeeps as it includes completely removed
tuple like tuples-(2). I added nleft_dead_itemids to count only
tuples-(3) and nleft_dead_tuples to count only tuples-(1) for
reporting. Tuples-(2) are removed even when index cleanup is disabled.

It makes more sense
just to treat this case identically to dead-but-not-yet-removable.
I have substantial doubts about nleft_dead_itemids being worth
anything, as well.

I think that the adding tuples-(3) to nkeeps would be a good idea. If
we do that, nleft_dead_tuples is no longer necessary. On the other
hand, I think we need nleft_dead_itemids to report how many itemids we
left when index cleanup is disabled.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#81Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#78)
Re: New vacuum option to do only freezing

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Unless this has been changed since I last looked at it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#82Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#81)
Re: New vacuum option to do only freezing

On 2019-Apr-16, Robert Haas wrote:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Umm, but if we fail to freeze it, we'll leave a tuple around that's
below the relfrozenxid for the table, causing later pg_commit to be
truncated and error messages saying that the tuple cannot be read, no?

I remember that for a similar case in multixact-land, what we do is
generate a fresh multixact that carries the members that are still alive
(ie. those that cause the multixact to be kept rather than remove it),
and relabel the tuple with that one. So the old multixact can be
removed safely. Obviously we cannot do that for XIDs, but I do wonder
what can possibly cause a tuple to be unfreezable yet the XID to be
below the freeze horizon. Surely if the transaction is that old, we
should have complained about it, and generated a freeze horizon that was
even older?

Unless this has been changed since I last looked at it.

I don't think so.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#83Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#81)
Re: New vacuum option to do only freezing

On Tue, Apr 16, 2019 at 11:26 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Hmm, I think that we already choose to leave HEAPTUPLE_DEAD tuples and
might freeze them if HeapTupleIsHotUpdated() ||
HeapTupleIsHeapOnly(L1083 at vacuumlazy.c) is true, which actually
have to be deleted. What difference between these tuples and the
tuples that we intentionally leave when index cleanup is disabled?
Maybe I'm missing something and confused.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#84Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#82)
Re: New vacuum option to do only freezing

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2019-Apr-16, Robert Haas wrote:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Umm, but if we fail to freeze it, we'll leave a tuple around that's
below the relfrozenxid for the table, causing later pg_commit to be
truncated and error messages saying that the tuple cannot be read, no?

Yeah. If you think that it's unsafe to freeze the tuple, then this
entire patch is ill-conceived and needs to be reverted. I don't
know how much more plainly I can put it: index_cleanup cannot be a
license to ignore the freeze horizon. (Indeed, I do not quite see
what the point of the feature is otherwise. Why would you run a
vacuum with this option at all, if not to increase the table's
relfrozenxid? But you can *not* advance relfrozenxid if you left
old XIDs behind.)

regards, tom lane

#85Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#82)
Re: New vacuum option to do only freezing

Hi,

On 2019-04-16 10:54:34 -0400, Alvaro Herrera wrote:

On 2019-Apr-16, Robert Haas wrote:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure that's correct. If you do that, it'll end up in the
non-tupgone case, which might try to freeze a tuple that should've
been removed. Or am I confused?

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Umm, but if we fail to freeze it, we'll leave a tuple around that's
below the relfrozenxid for the table, causing later pg_commit to be
truncated and error messages saying that the tuple cannot be read, no?

Is the below-relfrozenxid case actually reachable? Isn't the theory of
that whole codeblock that we ought to only get there if a transaction
concurrently commits?

* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.

And in case there was a concurrent transaction at the time of the
heap_page_prune(), it got to be above the OldestXmin passed to
HeapTupleSatisfiesVacuum() to - as it's the same OldestXmin value. And
as FreezeLimit should always be older than than OldestXmin, we should
never get into a situation where heap_page_prune() couldn't prune
something that we would have been forced to remove?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Unless this has been changed since I last looked at it.

I don't think so.

I think it has changed a bit - these days heap_prepare_freeze_tuple()
will detect that case, and error out:

/*
* If we freeze xmax, make absolutely sure that it's not an XID
* that is important. (Note, a lock-only xmax can be removed
* independent of committedness, since a committed lock holder has
* released the lock).
*/
if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed xmax %u",
xid)));
and the equivalent multixact case:

if (TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed update xid %u", xid)));

We even complain if xmin is uncommitted and would need to be frozen:

if (TransactionIdPrecedes(xid, cutoff_xid))
{
if (!TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("uncommitted xmin %u from before xid cutoff %u needs to be frozen",
xid, cutoff_xid)));

I IIRC added that after one of the multixact issues lead to precisely
that, heap_prepare_freeze_tuple() leading to a valid xmax just being
emptied out, resurfacing dead tuples (and HOT corruption and such).

These messages are obviously intended to be a backstop against
continuing to corrupt further, than actually something a user should
ever see in a working system.

Greetings,

Andres Freund

#86Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#84)
Re: New vacuum option to do only freezing

Hi,

On 2019-04-16 11:38:01 -0400, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2019-Apr-16, Robert Haas wrote:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Umm, but if we fail to freeze it, we'll leave a tuple around that's
below the relfrozenxid for the table, causing later pg_commit to be
truncated and error messages saying that the tuple cannot be read, no?

Yeah. If you think that it's unsafe to freeze the tuple, then this
entire patch is ill-conceived and needs to be reverted. I don't
know how much more plainly I can put it: index_cleanup cannot be a
license to ignore the freeze horizon. (Indeed, I do not quite see
what the point of the feature is otherwise. Why would you run a
vacuum with this option at all, if not to increase the table's
relfrozenxid? But you can *not* advance relfrozenxid if you left
old XIDs behind.)

As I just wrote - I don't think this codepath can ever deal with tuples
that old.

Greetings,

Andres Freund

#87Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#81)
Re: New vacuum option to do only freezing

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we're failing to remove it, and it's below the desired freeze
horizon, then we'd darn well better freeze it instead, no?

I don't know that that's safe. IIRC, the freeze code doesn't cope
nicely with being given a tuple that actually ought to have been
deleted. It'll just freeze it anyway, which is obviously bad.

Looking at heap_prepare_freeze_tuple, it looks to me like it'd notice
the problem and throw an error. The two possible reasons for a tuple
to be dead are xmin aborted and xmax committed, right? There are
tests in there that will complain if either of those is true and
the xid is below the freeze horizon.

Given that we don't get here except when the tuple has just become dead,
it probably is all right to assume that it can't possibly get selected
for freezing, and let those tests backstop the assumption.

(BTW, I don't understand why that code will throw "found xmin %u from
before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't
the whole if-branch at lines 6113ff be skipped if xmin_frozen?)

regards, tom lane

PS: I see that mandrill just replicated the topminnow failure that
started this discussion.

#88Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#87)
Re: New vacuum option to do only freezing

Hi,

On 2019-04-16 12:01:36 -0400, Tom Lane wrote:

(BTW, I don't understand why that code will throw "found xmin %u from
before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't
the whole if-branch at lines 6113ff be skipped if xmin_frozen?)

I *think* that just looks odd, but isn't actively wrong. That's because
TransactionIdIsNormal() won't trigger, as:

#define HeapTupleHeaderGetXmin(tup) \
( \
HeapTupleHeaderXminFrozen(tup) ? \
FrozenTransactionId : HeapTupleHeaderGetRawXmin(tup) \
)

which afaict makes
xmin_frozen = ((xid == FrozenTransactionId) ||
HeapTupleHeaderXminFrozen(tuple));
redundant.

Looks like that was introduced relatively recently, in

commit d2599ecfcc74fea9fad1720a70210a740c716730
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: 2018-05-04 15:24:44 -0300

Don't mark pages all-visible spuriously

@@ -6814,6 +6815,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,

    /* Process xmin */
    xid = HeapTupleHeaderGetXmin(tuple);
+   xmin_frozen = ((xid == FrozenTransactionId) ||
+                  HeapTupleHeaderXminFrozen(tuple));
    if (TransactionIdIsNormal(xid))
    {
        if (TransactionIdPrecedes(xid, relfrozenxid))
@@ -6832,9 +6835,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,

frz->t_infomask |= HEAP_XMIN_FROZEN;
changed = true;
+ xmin_frozen = true;
}
- else
- totally_frozen = false;
}

Greetings,

Andres Freund

#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#88)
Re: New vacuum option to do only freezing

So after thinking about this a bit more ...

ISTM that what we have here is a race condition (ie, tuple changed state
since heap_page_prune), and that ideally we want the code to resolve it
as if no race had happened. That is, either of these behaviors would
be acceptable:

1. Delete the tuple, just as heap_page_prune would've done if it had seen
it DEAD. (Possibly we could implement that by jumping back and doing
heap_page_prune again, but that seems pretty messy and bug-prone.
In any case, if we're not doing index cleanup then this must reduce to
"replace tuple by a dead line pointer", not remove it altogether.)

2. Act as if the tuple were still live, just as would've happened if the
state didn't change till just after we looked instead of just before.

Option #2 is a lot simpler and safer, and can be implemented as I
suggested earlier, assuming we're all good with the assumption that
heap_prepare_freeze_tuple isn't going to do anything bad.

However ... it strikes me that there's yet another assumption in here
that this patch has broken. Namely, notice that the reason we normally
don't get here is that what heap_page_prune does with an already-DEAD
tuple is reduce it to a dead line pointer and then count it in its
return value, which gets added to tups_vacuumed. But then what
lazy_scan_heap's per-tuple loop does is

/*
* DEAD item pointers are to be vacuumed normally; but we don't
* count them in tups_vacuumed, else we'd be double-counting (at
* least in the common case where heap_page_prune() just freed up
* a non-HOT tuple).
*/
if (ItemIdIsDead(itemid))
{
lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
all_visible = false;
continue;
}

When this patch is active, it will *greatly* increase the odds that
we report a misleading tups_vacuumed total, for two different reasons:

* DEAD tuples reduced to dead line pointers during heap_page_prune will be
counted as tups_vacuumed, even though we don't take the further step of
removing the dead line pointer, as always happened before.

* When, after some vacuum cycles with index_cleanup disabled, we finally
do one with index_cleanup enabled, there are going to be a heck of a lot
of old dead line pointers to clean out, which the existing logic won't
count at all. That was only barely tolerable before, and it seems like
this has pushed it over the bounds of silliness. People are going to
be wondering why vacuum reports that it removed zillions of index
entries and no tuples.

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

BTW, the fact that dead line pointers will accumulate without limit
makes me even more dubious of the proposition that this "feature"
will be safe to enable as a reloption in production. I really think
that we ought to restrict it to be a manual VACUUM option, to be
used only when you're desperate to freeze old tuples.

regards, tom lane

#90Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#89)
Re: New vacuum option to do only freezing

I wrote:

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

Here's a couple of ideas about that:

1. Ignore heap_page_prune's activity altogether, on the grounds that
it's just random chance that any cleanup done there was done during
VACUUM and not some preceding DML operation. Make tups_vacuumed
be the count of dead line pointers removed. The advantage of this
way is that tups_vacuumed would become independent of previous
non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
VACUUMs. But maybe it's hiding too much information.

2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
tuples that it deleted entirely. The action of replacing a DEAD
root tuple with a dead line pointer doesn't count for anything.
Then also add the count of dead line pointers removed to tups_vacuumed.

Approach #2 is closer to the way we've defined tups_vacuumed up to
now, but I think it'd be more realistic in cases where previous
pruning or index-cleanup-disabled vacuums have left lots of dead
line pointers.

I'm not especially wedded to either of these --- any other ideas?

regards, tom lane

#91Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#90)
Re: New vacuum option to do only freezing

On Wed, Apr 17, 2019 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

Here's a couple of ideas about that:

1. Ignore heap_page_prune's activity altogether, on the grounds that
it's just random chance that any cleanup done there was done during
VACUUM and not some preceding DML operation. Make tups_vacuumed
be the count of dead line pointers removed. The advantage of this
way is that tups_vacuumed would become independent of previous
non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
VACUUMs. But maybe it's hiding too much information.

2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
tuples that it deleted entirely. The action of replacing a DEAD
root tuple with a dead line pointer doesn't count for anything.
Then also add the count of dead line pointers removed to tups_vacuumed.

Approach #2 is closer to the way we've defined tups_vacuumed up to
now, but I think it'd be more realistic in cases where previous
pruning or index-cleanup-disabled vacuums have left lots of dead
line pointers.

On top of the approach #2, how about reporting the number of line
pointers we left so that user can notice that there are many dead line
pointers in the table?

I'm not especially wedded to either of these --- any other ideas?

Or, how about reporting the vacuumed tuples and line pointers we
separately? It might be too detailed but since with this patch we
delete either only tuples or only line pointers, it's more accurate.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#92Tom Lane
tgl@sss.pgh.pa.us
In reply to: Masahiko Sawada (#91)
Re: New vacuum option to do only freezing

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Wed, Apr 17, 2019 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

Or, how about reporting the vacuumed tuples and line pointers we
separately? It might be too detailed but since with this patch we
delete either only tuples or only line pointers, it's more accurate.

Yeah, if we wanted to expose these complications more directly, we
could think about adding or changing the main counters. I was wondering
about whether we should have it report "x bytes and y line pointers
freed", rather than counting tuples per se.

regards, tom lane

In reply to: Tom Lane (#92)
Re: New vacuum option to do only freezing

On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, if we wanted to expose these complications more directly, we
could think about adding or changing the main counters. I was wondering
about whether we should have it report "x bytes and y line pointers
freed", rather than counting tuples per se.

I like that idea, but I'm pretty sure that there are very few users
that are aware of these distinctions at all. It would be a good idea
to clearly document them.

--
Peter Geoghegan

#94Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#93)
Re: New vacuum option to do only freezing

On Thu, Apr 18, 2019 at 4:20 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, if we wanted to expose these complications more directly, we
could think about adding or changing the main counters. I was wondering
about whether we should have it report "x bytes and y line pointers
freed", rather than counting tuples per se.

It looks good idea to me.

I like that idea, but I'm pretty sure that there are very few users
that are aware of these distinctions at all. It would be a good idea
to clearly document them.

I completely agreed. I'm sure that only a few user can do the action
of enabling index cleanup when the report says there are many dead
line pointers in the table.

It brought me an another idea of reporting something like "x bytes
freed, y bytes can be freed after index cleanup". That is, we report
how much bytes including tuples and line pointers we freed and how
much bytes of line pointers can be freed after index cleanup. While
index cleanup is enabled, the latter value should always be 0. If the
latter value gets large user can be aware of necessity of doing index
cleanup.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#95Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#94)
1 attachment(s)
Re: New vacuum option to do only freezing

On Thu, Apr 18, 2019 at 3:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Apr 18, 2019 at 4:20 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Apr 17, 2019 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, if we wanted to expose these complications more directly, we
could think about adding or changing the main counters. I was wondering
about whether we should have it report "x bytes and y line pointers
freed", rather than counting tuples per se.

It looks good idea to me.

I like that idea, but I'm pretty sure that there are very few users
that are aware of these distinctions at all. It would be a good idea
to clearly document them.

I completely agreed. I'm sure that only a few user can do the action
of enabling index cleanup when the report says there are many dead
line pointers in the table.

It brought me an another idea of reporting something like "x bytes
freed, y bytes can be freed after index cleanup". That is, we report
how much bytes including tuples and line pointers we freed and how
much bytes of line pointers can be freed after index cleanup. While
index cleanup is enabled, the latter value should always be 0. If the
latter value gets large user can be aware of necessity of doing index
cleanup.

Attached the draft version of patch based on the discussion so far.
This patch fixes two issues: the assertion error topminnow reported
and the contents of the vacuum logs.

For the first issue, I've changed lazy_scan_heap so that it counts the
tuples that became dead after HOT pruning in nkeep when index cleanup
is disabled. As per discussions so far, it would be no problem to try
to freeze tuples that ought to have been deleted.

For the second issue, I've changed lazy vacuum so that it reports both
the number of kilobytes we freed and the number of kilobytes can be
freed after index cleanup. The former value includes the size of not
only heap tuples but also line pointers. That is, when a normal tuple
has been marked as either dead or redirected we count only the size of
heap tuple, and when it has been marked as unused we count the size of
both heap tuple and line pointer. Similarly when either a redirect
line pointer or a dead line pointer become unused we count only the
size of line pointer. The latter value we report could be non-zero
only when index cleanup is disabled; it counts the number of bytes of
dead line pointers we left. The advantage of this change is that user
can be aware of both how many bytes we freed and how many bytes we
left due to skipping index cleanup. User can be aware of the necessity
of doing index cleanup by seeing the latter value.

Also with this patch, we count only tuples that has been marked as
unused as deleted tuples. The action of replacing a dead root tuple
with a dead line pointer doesn't count for anything. It would be
close to the meaning of "deleted tuples" and less confusion. We do
that when actually marking rather than when recording because we could
record and forget dead tuples.

Here is the sample of the report by VACUUM VERBOSE. I used the
following script and the vacuum report is changed by the patch.

* Script
DROP TABLE IF EXISTS test;
CREATE TABLE test (c int primary key, d int);
INSERT INTO test SELECT * FROM generate_series(1,10000);
DELETE FROM test WHERE c < 2000;
VACUUM (INDEX_CLEANUP FALSE, VERBOSE) test;
VACUUM (INDEX_CLEANUP TRUE, VERBOSE) test;

* HEAD (when index cleanup is disabled)
INFO: vacuuming "public.test"
INFO: "test": found 1999 removable, 8001 nonremovable row versions in
45 out of 45 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 504
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
0 tuples and 1999 item identifiers are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

* Patched (when index cleanup is disabled)
INFO: vacuuming "public.test"
INFO: "test": 55 kB freed, 7996 bytes can be freed after index
cleanup, table size: 360 kB
DETAIL: found 8001 nonremovable row versions in 45 out of 45 pages.
0 dead row versions cannot be removed yet, oldest xmin: 1660
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

* HEAD (when index cleanup is enabled)
INFO: vacuuming "public.test"
INFO: scanned index "test_pkey" to remove 1999 row versions
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: "test": removed 1999 row versions in 9 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: index "test_pkey" now contains 8001 row versions in 30 pages
DETAIL: 1999 index row versions were removed.
5 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 0 removable, 91 nonremovable row versions in 10
out of 45 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 504
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
0 tuples and 0 item identifiers are left as dead.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

* Patched (when index cleanup is enabled)
INFO: vacuuming "public.test"
INFO: scanned index "test_pkey" to remove 1999 row versions
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: "test": removed 1999 row versions in 9 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: index "test_pkey" now contains 8001 row versions in 30 pages
DETAIL: 1999 index row versions were removed.
5 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": 7996 bytes freed, table size: 360 kB
DETAIL: found 91 nonremovable row versions in 10 out of 45 pages.
0 dead row versions cannot be removed yet, oldest xmin: 1660
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

The patch doesn't address the concern that Tom had, which is it might
not be safe in production to accumulate the dead line pointers without
limit when the reloption is set to false. I think this is a separate
issue from the above two issues so I'd like to discuss that after
these are solved.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

vacuum_index_cleanup.patchapplication/x-patch; name=vacuum_index_cleanup.patchDownload
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a3e5192..4f39c76 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -147,11 +147,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		 */
 		if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
 		{
-			TransactionId ignore = InvalidTransactionId;	/* return value not
-															 * needed */
+			/* return values not needed */
+			TransactionId ignore1 = InvalidTransactionId;
+			Size ignore2;
 
 			/* OK to prune */
-			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
+			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore1,
+								   &ignore2);
 		}
 
 		/* And release buffer lock */
@@ -173,12 +175,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * send its own new total to pgstats, and we don't want this delta applied
  * on top of that.)
  *
- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page. This function sets
+ * latestRemovedXid and increments freed_bytes.
  */
 int
 heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid)
+				bool report_stats, TransactionId *latestRemovedXid,
+				Size *freed_bytes)
 {
 	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
@@ -235,10 +238,11 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 		 * Apply the planned item changes, then repair page fragmentation, and
 		 * update the page's hint bit about whether it has free line pointers.
 		 */
-		heap_page_prune_execute(buffer,
-								prstate.redirected, prstate.nredirected,
-								prstate.nowdead, prstate.ndead,
-								prstate.nowunused, prstate.nunused);
+		*freed_bytes +=
+			heap_page_prune_execute(buffer,
+									prstate.redirected, prstate.nredirected,
+									prstate.nowdead, prstate.ndead,
+									prstate.nowunused, prstate.nunused);
 
 		/*
 		 * Update the page's pd_prune_xid field to either zero, or the lowest
@@ -346,7 +350,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
  * LP_DEAD state are added to nowdead[]; and items to be set to LP_UNUSED
  * state are added to nowunused[].
  *
- * Returns the number of tuples (to be) deleted from the page.
+ * Returns the number of tuples (to be) marked as unused from the page.
  */
 static int
 heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
@@ -584,14 +588,6 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		}
 
 		/*
-		 * If the root entry had been a normal tuple, we are deleting it, so
-		 * count it in the result.  But changing a redirect (even to DEAD
-		 * state) doesn't count.
-		 */
-		if (ItemIdIsNormal(rootlp))
-			ndeleted++;
-
-		/*
 		 * If the DEAD tuple is at the end of the chain, the entire chain is
 		 * dead and the root line pointer can be marked dead.  Otherwise just
 		 * redirect the root to the correct chain member.
@@ -676,8 +672,10 @@ heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum)
  * This is split out because it is also used by heap_xlog_clean()
  * to replay the WAL record when needed after a crash.  Note that the
  * arguments are identical to those of log_heap_clean().
+ *
+ * Return the total number of bytes we freed.
  */
-void
+Size
 heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
@@ -686,6 +684,7 @@ heap_page_prune_execute(Buffer buffer,
 	Page		page = (Page) BufferGetPage(buffer);
 	OffsetNumber *offnum;
 	int			i;
+	Size		freed_bytes = 0;
 
 	/* Update all redirected line pointers */
 	offnum = redirected;
@@ -695,6 +694,7 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber tooff = *offnum++;
 		ItemId		fromlp = PageGetItemId(page, fromoff);
 
+		freed_bytes += ItemIdGetLength(fromlp);
 		ItemIdSetRedirect(fromlp, tooff);
 	}
 
@@ -705,6 +705,7 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber off = *offnum++;
 		ItemId		lp = PageGetItemId(page, off);
 
+		freed_bytes += ItemIdGetLength(lp);
 		ItemIdSetDead(lp);
 	}
 
@@ -715,6 +716,8 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber off = *offnum++;
 		ItemId		lp = PageGetItemId(page, off);
 
+		/* Setting unused frees both heap tuple and line pointer */
+		freed_bytes += ItemIdGetLength(lp) + sizeof(ItemIdData);
 		ItemIdSetUnused(lp);
 	}
 
@@ -723,6 +726,8 @@ heap_page_prune_execute(Buffer buffer,
 	 * whether it has free pointers.
 	 */
 	PageRepairFragmentation(page);
+
+	return freed_bytes;
 }
 
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8dc76fa..cc8b036 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -44,6 +44,7 @@
 #include "access/transam.h"
 #include "access/visibilitymap.h"
 #include "access/xlog.h"
+#include "utils/builtins.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
@@ -125,10 +126,12 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
-	double		nleft_dead_tuples;	/* # of dead tuples we left */
-	double		nleft_dead_itemids;	/* # of dead item pointers we left */
+	Size		freed_bytes;		/* bytes of the size of both heap tuple and
+									 * line pointer we freed */
+	Size		left_lp_bytes;		/* bytes of dead line pointer we left, which
+									 * can be freed after index cleanup */
 	BlockNumber pages_removed;
-	double		tuples_deleted;
+	double		tuples_deleted;	/* # of tuples marked as unused */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 	/* List of TIDs of tuples we intend to delete */
 	/* NB: this list is ordered by TID address */
@@ -176,6 +179,7 @@ static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static char *get_pretty_size(Size size);
 
 
 /*
@@ -412,23 +416,27 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 get_namespace_name(RelationGetNamespace(onerel)),
 							 RelationGetRelationName(onerel),
 							 vacrelstats->num_index_scans);
+			if (vacrelstats->left_lp_bytes > 0)
+				appendStringInfo(&buf,
+								 _("stats: %s freed, %s can be freed after index cleanup, table size: %s\n"),
+								 get_pretty_size(vacrelstats->freed_bytes),
+								 get_pretty_size(vacrelstats->left_lp_bytes),
+								 get_pretty_size(vacrelstats->rel_pages * BLCKSZ));
+			else
+				appendStringInfo(&buf,
+								 _("stats: %s freed, table size: %s\n"),
+								 get_pretty_size(vacrelstats->freed_bytes),
+								 get_pretty_size(vacrelstats->rel_pages * BLCKSZ));
 			appendStringInfo(&buf, _("pages: %u removed, %u remain, %u skipped due to pins, %u skipped frozen\n"),
 							 vacrelstats->pages_removed,
 							 vacrelstats->rel_pages,
 							 vacrelstats->pinskipped_pages,
 							 vacrelstats->frozenskipped_pages);
 			appendStringInfo(&buf,
-							 _("tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable, oldest xmin: %u\n"),
-							 vacrelstats->tuples_deleted,
+							 _("tuples: %.0f remain, %.0f are dead but not yet removable, oldest xmin: %u\n"),
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
-			if (vacrelstats->nleft_dead_tuples > 0 ||
-				vacrelstats->nleft_dead_itemids > 0)
-				appendStringInfo(&buf,
-								 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
-								 vacrelstats->nleft_dead_tuples,
-								 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -509,12 +517,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				next_fsm_block_to_vacuum;
 	double		num_tuples,		/* total number of nonremovable tuples */
 				live_tuples,	/* live tuples (reltuples estimate) */
-				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused,		/* unused item pointers */
-				nleft_dead_tuples,		/* tuples we left as dead */
-				nleft_dead_itemids;		/* item pointers we left as dead,
-										 * includes nleft_dead_tuples. */
+				nunused;		/* unused item pointers */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -546,8 +550,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
-	nleft_dead_itemids = nleft_dead_tuples = 0;
+	num_tuples = live_tuples = nkeep = nunused = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -980,8 +983,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 *
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
-		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-										 &vacrelstats->latestRemovedXid);
+		vacrelstats->tuples_deleted +=
+			heap_page_prune(onerel, buf, OldestXmin, false,
+							&vacrelstats->latestRemovedXid,
+							&vacrelstats->freed_bytes);
 
 		/*
 		 * Now scan the page to collect vacuumable items and check for tuples
@@ -1022,12 +1027,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			ItemPointerSet(&(tuple.t_self), blkno, offnum);
 
-			/*
-			 * DEAD item pointers are to be vacuumed normally; but we don't
-			 * count them in tups_vacuumed, else we'd be double-counting (at
-			 * least in the common case where heap_page_prune() just freed up
-			 * a non-HOT tuple).
-			 */
+			/* DEAD item pointers are to be vacuumed normally */
 			if (ItemIdIsDead(itemid))
 			{
 				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
@@ -1068,10 +1068,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					 *
 					 * If the tuple is HOT-updated then it must only be
 					 * removed by a prune operation; so we keep it just as if
-					 * it were RECENTLY_DEAD.  Also, if it's a heap-only
-					 * tuple, we choose to keep it, because it'll be a lot
-					 * cheaper to get rid of it in the next pruning pass than
-					 * to treat it like an indexed tuple.
+					 * it were RECENTLY_DEAD.  Also, either if it's a heap-only
+					 * tuple or when index cleanup is disabled, we choose to
+					 * keep it, because it'll be a lot cheaper to get rid of
+					 * it in the next pruning pass than to treat it.
 					 *
 					 * If this were to happen for a tuple that actually needed
 					 * to be deleted, we'd be in trouble, because it'd
@@ -1081,20 +1081,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					 * preventing corruption.
 					 */
 					if (HeapTupleIsHotUpdated(&tuple) ||
-						HeapTupleIsHeapOnly(&tuple))
+						HeapTupleIsHeapOnly(&tuple) ||
+						params->index_cleanup == VACOPT_TERNARY_DISABLED)
 						nkeep += 1;
 					else
-					{
 						tupgone = true; /* we can delete the tuple */
-
-						/*
-						 * Since this dead tuple will not be vacuumed and
-						 * ignored when index cleanup is disabled we count
-						 * count it for reporting.
-						 */
-						if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
-							nleft_dead_tuples++;
-					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
@@ -1183,7 +1174,6 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
 				has_dead_tuples = true;
 			}
 			else
@@ -1262,16 +1252,33 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			else
 			{
 				/*
-				 * Here, we have indexes but index cleanup is disabled. Instead of
-				 * vacuuming the dead tuples on the heap, we just forget them.
-				 *
-				 * Note that vacrelstats->dead_tuples could have tuples which
-				 * became dead after HOT-pruning but are not marked dead yet.
-				 * We do not process them because it's a very rare condition, and
-				 * the next vacuum will process them anyway.
+				 * Here, we have indexes but index cleanup is disabled and
+				 * vacrelstats->dead_tuples have only tuples marked as DEAD.
+				 * Instead of vacuuming the dead tuples on the heap, we forget
+				 * them and count the total bytes of dead line pointers.
 				 */
+#ifdef USE_ASSERT_CHECKING
 				Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
-				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+
+				/*
+				 * Check whether vacrelstats->dead_tuples has only tuples
+				 * marked as dead.
+				 */
+				{
+					Page	page = BufferGetPage(buf);
+					int i;
+					for (i = 0; i < vacrelstats->num_dead_tuples; i++)
+					{
+						OffsetNumber offset =
+							ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[i]);
+						ItemId lp = PageGetItemId(page, offset);
+
+						Assert(ItemIdIsDead(lp));
+					}
+				}
+#endif
+				vacrelstats->left_lp_bytes +=
+					(Size) (vacrelstats->num_dead_tuples * sizeof(ItemIdData));
 			}
 
 			/*
@@ -1398,9 +1405,9 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
-	/* No dead tuples should be left if index cleanup is enabled */
+	/* No dead line pointer should be left if index cleanup is enabled */
 	Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
-			nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+			vacrelstats->left_lp_bytes == 0) ||
 		   params->index_cleanup == VACOPT_TERNARY_DISABLED);
 
 	/* report that everything is scanned and vacuumed */
@@ -1409,10 +1416,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pfree(frozen);
 
 	/* save stats for use later */
-	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
-	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
-	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
+	vacrelstats->new_dead_tuples = nkeep;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1492,7 +1496,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
 						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+						vacrelstats->tuples_deleted, vacuumed_pages)));
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -1500,6 +1504,9 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	 */
 	initStringInfo(&buf);
 	appendStringInfo(&buf,
+					 _("found %.0f nonremovable row versions in %u out of %u pages.\n"),
+					   num_tuples, vacrelstats->scanned_pages, nblocks);
+	appendStringInfo(&buf,
 					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
 					 nkeep, OldestXmin);
 	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
@@ -1516,16 +1523,24 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
-	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
-					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
-	ereport(elevel,
-			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
-					RelationGetRelationName(onerel),
-					tups_vacuumed, num_tuples,
-					vacrelstats->scanned_pages, nblocks),
-			 errdetail_internal("%s", buf.data)));
+	if (vacrelstats->left_lp_bytes > 0)
+		ereport(elevel,
+				(errmsg("\"%s\": %s freed, %s can be freed after index cleanup, table size: %s",
+						RelationGetRelationName(onerel),
+						get_pretty_size(vacrelstats->freed_bytes),
+						get_pretty_size(vacrelstats->left_lp_bytes),
+						get_pretty_size(nblocks * BLCKSZ)),
+				 errdetail_internal("%s", buf.data)));
+	else
+		ereport(elevel,
+				(errmsg("\"%s\": %s freed, table size: %s",
+						RelationGetRelationName(onerel),
+						get_pretty_size(vacrelstats->freed_bytes),
+						get_pretty_size(nblocks * BLCKSZ)),
+				 errdetail_internal("%s", buf.data)));
+
 	pfree(buf.data);
 }
 
@@ -1632,6 +1647,11 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 			break;				/* past end of tuples for this block */
 		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
 		itemid = PageGetItemId(page, toff);
+
+		/* Count the number of bytes we freed */
+		vacrelstats->freed_bytes +=
+				ItemIdGetLength(itemid) + sizeof(ItemIdData);
+
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
 	}
@@ -1663,6 +1683,9 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	 */
 	END_CRIT_SECTION();
 
+	/* Count tuples marked as unused */
+	vacrelstats->tuples_deleted += uncnt;
+
 	/*
 	 * Now that we have removed the dead tuples from the page, once again
 	 * check if the page has become all-visible.  The page is already marked
@@ -2375,3 +2398,15 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/* Return a string of formatted size with size units */
+static char *
+get_pretty_size(Size size)
+{
+	Datum size_text;
+
+	size_text = DirectFunctionCall1(pg_size_pretty,
+									Int64GetDatum(size));
+
+	return text_to_cstring(DatumGetTextP(size_text));
+}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 77e5e60..2c9e15e 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -184,8 +184,9 @@ extern TransactionId heap_compute_xid_horizon_for_tuples(Relation rel,
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int heap_page_prune(Relation relation, Buffer buffer,
 				TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid);
-extern void heap_page_prune_execute(Buffer buffer,
+				bool report_stats, TransactionId *latestRemovedXid,
+				Size *freed_bytes);
+extern Size heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
 						OffsetNumber *nowunused, int nunused);
#96Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#95)
Re: New vacuum option to do only freezing

On Tue, Apr 23, 2019 at 7:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

For the second issue, I've changed lazy vacuum so that it reports both
the number of kilobytes we freed and the number of kilobytes can be
freed after index cleanup.

I am not very convinced that this reporting is in any useful to users.
Despite N kilobytes of tuples having been freed, the pages themselves
are still allocated and the actual ability to reuse that space may be
dependent on lots of factors that the user can't control like the
sizes of newly-inserted tuples and the degree to which the free space
map is accurate.

I feel like we're drifting off into inventing new kinds of reporting
here instead of focusing on fixing the reported defects of the
already-committed patch, but perhaps I am taking too narrow a view of
the situation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#97Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#96)
2 attachment(s)
Re: New vacuum option to do only freezing

On Fri, Apr 26, 2019 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 23, 2019 at 7:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

For the second issue, I've changed lazy vacuum so that it reports both
the number of kilobytes we freed and the number of kilobytes can be
freed after index cleanup.

I am not very convinced that this reporting is in any useful to users.
Despite N kilobytes of tuples having been freed, the pages themselves
are still allocated and the actual ability to reuse that space may be
dependent on lots of factors that the user can't control like the
sizes of newly-inserted tuples and the degree to which the free space
map is accurate.

Hmm, it's a term problem? The phrase 'x bytes vacuumed' would solve it?

I feel like we're drifting off into inventing new kinds of reporting
here instead of focusing on fixing the reported defects of the
already-committed patch, but perhaps I am taking too narrow a view of
the situation.

I should have divided the patches into two: fixing assertion error and
the reporting. I think we could think the latter issue also is a kind
of bug because it can report something like "1000 index tuples
vacuumed but 0 heap tuple vacuumed" in case where the vacuumed table
had only dead line pointers. Maybe I should add an another open item
for the latter.

Attached the split version patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

fix_vacuum_reporting_v2.patchapplication/octet-stream; name=fix_vacuum_reporting_v2.patchDownload
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a3e5192..4f39c76 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -147,11 +147,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		 */
 		if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
 		{
-			TransactionId ignore = InvalidTransactionId;	/* return value not
-															 * needed */
+			/* return values not needed */
+			TransactionId ignore1 = InvalidTransactionId;
+			Size ignore2;
 
 			/* OK to prune */
-			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
+			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore1,
+								   &ignore2);
 		}
 
 		/* And release buffer lock */
@@ -173,12 +175,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * send its own new total to pgstats, and we don't want this delta applied
  * on top of that.)
  *
- * Returns the number of tuples deleted from the page and sets
- * latestRemovedXid.
+ * Returns the number of tuples deleted from the page. This function sets
+ * latestRemovedXid and increments freed_bytes.
  */
 int
 heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid)
+				bool report_stats, TransactionId *latestRemovedXid,
+				Size *freed_bytes)
 {
 	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
@@ -235,10 +238,11 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 		 * Apply the planned item changes, then repair page fragmentation, and
 		 * update the page's hint bit about whether it has free line pointers.
 		 */
-		heap_page_prune_execute(buffer,
-								prstate.redirected, prstate.nredirected,
-								prstate.nowdead, prstate.ndead,
-								prstate.nowunused, prstate.nunused);
+		*freed_bytes +=
+			heap_page_prune_execute(buffer,
+									prstate.redirected, prstate.nredirected,
+									prstate.nowdead, prstate.ndead,
+									prstate.nowunused, prstate.nunused);
 
 		/*
 		 * Update the page's pd_prune_xid field to either zero, or the lowest
@@ -346,7 +350,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
  * LP_DEAD state are added to nowdead[]; and items to be set to LP_UNUSED
  * state are added to nowunused[].
  *
- * Returns the number of tuples (to be) deleted from the page.
+ * Returns the number of tuples (to be) marked as unused from the page.
  */
 static int
 heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
@@ -584,14 +588,6 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		}
 
 		/*
-		 * If the root entry had been a normal tuple, we are deleting it, so
-		 * count it in the result.  But changing a redirect (even to DEAD
-		 * state) doesn't count.
-		 */
-		if (ItemIdIsNormal(rootlp))
-			ndeleted++;
-
-		/*
 		 * If the DEAD tuple is at the end of the chain, the entire chain is
 		 * dead and the root line pointer can be marked dead.  Otherwise just
 		 * redirect the root to the correct chain member.
@@ -676,8 +672,10 @@ heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum)
  * This is split out because it is also used by heap_xlog_clean()
  * to replay the WAL record when needed after a crash.  Note that the
  * arguments are identical to those of log_heap_clean().
+ *
+ * Return the total number of bytes we freed.
  */
-void
+Size
 heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
@@ -686,6 +684,7 @@ heap_page_prune_execute(Buffer buffer,
 	Page		page = (Page) BufferGetPage(buffer);
 	OffsetNumber *offnum;
 	int			i;
+	Size		freed_bytes = 0;
 
 	/* Update all redirected line pointers */
 	offnum = redirected;
@@ -695,6 +694,7 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber tooff = *offnum++;
 		ItemId		fromlp = PageGetItemId(page, fromoff);
 
+		freed_bytes += ItemIdGetLength(fromlp);
 		ItemIdSetRedirect(fromlp, tooff);
 	}
 
@@ -705,6 +705,7 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber off = *offnum++;
 		ItemId		lp = PageGetItemId(page, off);
 
+		freed_bytes += ItemIdGetLength(lp);
 		ItemIdSetDead(lp);
 	}
 
@@ -715,6 +716,8 @@ heap_page_prune_execute(Buffer buffer,
 		OffsetNumber off = *offnum++;
 		ItemId		lp = PageGetItemId(page, off);
 
+		/* Setting unused frees both heap tuple and line pointer */
+		freed_bytes += ItemIdGetLength(lp) + sizeof(ItemIdData);
 		ItemIdSetUnused(lp);
 	}
 
@@ -723,6 +726,8 @@ heap_page_prune_execute(Buffer buffer,
 	 * whether it has free pointers.
 	 */
 	PageRepairFragmentation(page);
+
+	return freed_bytes;
 }
 
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9889e48..5f3d018 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -44,6 +44,7 @@
 #include "access/transam.h"
 #include "access/visibilitymap.h"
 #include "access/xlog.h"
+#include "utils/builtins.h"
 #include "catalog/storage.h"
 #include "commands/dbcommands.h"
 #include "commands/progress.h"
@@ -125,10 +126,12 @@ typedef struct LVRelStats
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
 	double		new_dead_tuples;	/* new estimated total # of dead tuples */
-	double		nleft_dead_tuples;	/* # of dead tuples we left */
-	double		nleft_dead_itemids;	/* # of dead item pointers we left */
+	Size		freed_bytes;		/* bytes of the size of both heap tuple and
+									 * line pointer we freed */
+	Size		left_lp_bytes;		/* bytes of dead line pointer we left, which
+									 * can be freed after index cleanup */
 	BlockNumber pages_removed;
-	double		tuples_deleted;
+	double		tuples_deleted;	/* # of tuples marked as unused */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 	/* List of TIDs of tuples we intend to delete */
 	/* NB: this list is ordered by TID address */
@@ -176,6 +179,7 @@ static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
 static int	vac_cmp_itemptr(const void *left, const void *right);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 						 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static char *get_pretty_size(Size size);
 
 
 /*
@@ -416,23 +420,27 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 							 get_namespace_name(RelationGetNamespace(onerel)),
 							 RelationGetRelationName(onerel),
 							 vacrelstats->num_index_scans);
+			if (vacrelstats->left_lp_bytes > 0)
+				appendStringInfo(&buf,
+								 _("stats: %s freed, %s can be freed after index cleanup, table size: %s\n"),
+								 get_pretty_size(vacrelstats->freed_bytes),
+								 get_pretty_size(vacrelstats->left_lp_bytes),
+								 get_pretty_size(vacrelstats->rel_pages * BLCKSZ));
+			else
+				appendStringInfo(&buf,
+								 _("stats: %s freed, table size: %s\n"),
+								 get_pretty_size(vacrelstats->freed_bytes),
+								 get_pretty_size(vacrelstats->rel_pages * BLCKSZ));
 			appendStringInfo(&buf, _("pages: %u removed, %u remain, %u skipped due to pins, %u skipped frozen\n"),
 							 vacrelstats->pages_removed,
 							 vacrelstats->rel_pages,
 							 vacrelstats->pinskipped_pages,
 							 vacrelstats->frozenskipped_pages);
 			appendStringInfo(&buf,
-							 _("tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable, oldest xmin: %u\n"),
-							 vacrelstats->tuples_deleted,
+							 _("tuples: %.0f remain, %.0f are dead but not yet removable, oldest xmin: %u\n"),
 							 vacrelstats->new_rel_tuples,
 							 vacrelstats->new_dead_tuples,
 							 OldestXmin);
-			if (vacrelstats->nleft_dead_tuples > 0 ||
-				vacrelstats->nleft_dead_itemids > 0)
-				appendStringInfo(&buf,
-								 _("%.0f tuples and %.0f item identifiers are left as dead.\n"),
-								 vacrelstats->nleft_dead_tuples,
-								 vacrelstats->nleft_dead_itemids);
 			appendStringInfo(&buf,
 							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
 							 VacuumPageHit,
@@ -513,12 +521,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				next_fsm_block_to_vacuum;
 	double		num_tuples,		/* total number of nonremovable tuples */
 				live_tuples,	/* live tuples (reltuples estimate) */
-				tups_vacuumed,	/* tuples cleaned up by vacuum */
 				nkeep,			/* dead-but-not-removable tuples */
-				nunused,		/* unused item pointers */
-				nleft_dead_tuples,		/* tuples we left as dead */
-				nleft_dead_itemids;		/* item pointers we left as dead,
-										 * includes nleft_dead_tuples. */
+				nunused;		/* unused item pointers */
 	IndexBulkDeleteResult **indstats;
 	int			i;
 	PGRUsage	ru0;
@@ -550,8 +554,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 	empty_pages = vacuumed_pages = 0;
 	next_fsm_block_to_vacuum = (BlockNumber) 0;
-	num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
-	nleft_dead_itemids = nleft_dead_tuples = 0;
+	num_tuples = live_tuples = nkeep = nunused = 0;
 
 	indstats = (IndexBulkDeleteResult **)
 		palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
@@ -984,8 +987,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		 *
 		 * We count tuples removed by the pruning step as removed by VACUUM.
 		 */
-		tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
-										 &vacrelstats->latestRemovedXid);
+		vacrelstats->tuples_deleted +=
+			heap_page_prune(onerel, buf, OldestXmin, false,
+							&vacrelstats->latestRemovedXid,
+							&vacrelstats->freed_bytes);
 
 		/*
 		 * Now scan the page to collect vacuumable items and check for tuples
@@ -1026,12 +1031,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 
 			ItemPointerSet(&(tuple.t_self), blkno, offnum);
 
-			/*
-			 * DEAD item pointers are to be vacuumed normally; but we don't
-			 * count them in tups_vacuumed, else we'd be double-counting (at
-			 * least in the common case where heap_page_prune() just freed up
-			 * a non-HOT tuple).
-			 */
+			/* DEAD item pointers are to be vacuumed normally */
 			if (ItemIdIsDead(itemid))
 			{
 				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
@@ -1178,7 +1178,6 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 				lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
 				HeapTupleHeaderAdvanceLatestRemovedXid(tuple.t_data,
 													   &vacrelstats->latestRemovedXid);
-				tups_vacuumed += 1;
 				has_dead_tuples = true;
 			}
 			else
@@ -1257,16 +1256,33 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			else
 			{
 				/*
-				 * Here, we have indexes but index cleanup is disabled. Instead of
-				 * vacuuming the dead tuples on the heap, we just forget them.
-				 *
-				 * Note that vacrelstats->dead_tuples could have tuples which
-				 * became dead after HOT-pruning but are not marked dead yet.
-				 * We do not process them because it's a very rare condition, and
-				 * the next vacuum will process them anyway.
+				 * Here, we have indexes but index cleanup is disabled and
+				 * vacrelstats->dead_tuples have only tuples marked as DEAD.
+				 * Instead of vacuuming the dead tuples on the heap, we forget
+				 * them and count the total bytes of dead line pointers.
 				 */
+#ifdef USE_ASSERT_CHECKING
 				Assert(params->index_cleanup == VACOPT_TERNARY_DISABLED);
-				nleft_dead_itemids += vacrelstats->num_dead_tuples;
+
+				/*
+				 * Check whether vacrelstats->dead_tuples has only tuples
+				 * marked as dead.
+				 */
+				{
+					Page	page = BufferGetPage(buf);
+					int i;
+					for (i = 0; i < vacrelstats->num_dead_tuples; i++)
+					{
+						OffsetNumber offset =
+							ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[i]);
+						ItemId lp = PageGetItemId(page, offset);
+
+						Assert(ItemIdIsDead(lp));
+					}
+				}
+#endif
+				vacrelstats->left_lp_bytes +=
+					(Size) (vacrelstats->num_dead_tuples * sizeof(ItemIdData));
 			}
 
 			/*
@@ -1393,9 +1409,9 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			RecordPageWithFreeSpace(onerel, blkno, freespace, nblocks);
 	}
 
-	/* No dead tuples should be left if index cleanup is enabled */
+	/* No dead line pointer should be left if index cleanup is enabled */
 	Assert((params->index_cleanup == VACOPT_TERNARY_ENABLED &&
-			nleft_dead_tuples == 0 && nleft_dead_itemids == 0) ||
+			vacrelstats->left_lp_bytes == 0) ||
 		   params->index_cleanup == VACOPT_TERNARY_DISABLED);
 
 	/* report that everything is scanned and vacuumed */
@@ -1404,10 +1420,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	pfree(frozen);
 
 	/* save stats for use later */
-	vacrelstats->tuples_deleted = tups_vacuumed;
-	vacrelstats->new_dead_tuples = nkeep + nleft_dead_tuples;
-	vacrelstats->nleft_dead_tuples = nleft_dead_tuples;
-	vacrelstats->nleft_dead_itemids = nleft_dead_itemids;
+	vacrelstats->new_dead_tuples = nkeep;
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
@@ -1487,7 +1500,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		ereport(elevel,
 				(errmsg("\"%s\": removed %.0f row versions in %u pages",
 						RelationGetRelationName(onerel),
-						tups_vacuumed, vacuumed_pages)));
+						vacrelstats->tuples_deleted, vacuumed_pages)));
 
 	/*
 	 * This is pretty messy, but we split it up so that we can skip emitting
@@ -1495,6 +1508,9 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	 */
 	initStringInfo(&buf);
 	appendStringInfo(&buf,
+					 _("found %.0f nonremovable row versions in %u out of %u pages.\n"),
+					   num_tuples, vacrelstats->scanned_pages, nblocks);
+	appendStringInfo(&buf,
 					 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
 					 nkeep, OldestXmin);
 	appendStringInfo(&buf, _("There were %.0f unused item pointers.\n"),
@@ -1511,16 +1527,24 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 									"%u pages are entirely empty.\n",
 									empty_pages),
 					 empty_pages);
-	appendStringInfo(&buf, "%.0f tuples and %.0f item identifiers are left as dead.\n",
-					 nleft_dead_tuples, nleft_dead_itemids);
 	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
-	ereport(elevel,
-			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
-					RelationGetRelationName(onerel),
-					tups_vacuumed, num_tuples,
-					vacrelstats->scanned_pages, nblocks),
-			 errdetail_internal("%s", buf.data)));
+	if (vacrelstats->left_lp_bytes > 0)
+		ereport(elevel,
+				(errmsg("\"%s\": %s freed, %s can be freed after index cleanup, table size: %s",
+						RelationGetRelationName(onerel),
+						get_pretty_size(vacrelstats->freed_bytes),
+						get_pretty_size(vacrelstats->left_lp_bytes),
+						get_pretty_size(nblocks * BLCKSZ)),
+				 errdetail_internal("%s", buf.data)));
+	else
+		ereport(elevel,
+				(errmsg("\"%s\": %s freed, table size: %s",
+						RelationGetRelationName(onerel),
+						get_pretty_size(vacrelstats->freed_bytes),
+						get_pretty_size(nblocks * BLCKSZ)),
+				 errdetail_internal("%s", buf.data)));
+
 	pfree(buf.data);
 }
 
@@ -1627,6 +1651,11 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 			break;				/* past end of tuples for this block */
 		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
 		itemid = PageGetItemId(page, toff);
+
+		/* Count the number of bytes we freed */
+		vacrelstats->freed_bytes +=
+				ItemIdGetLength(itemid) + sizeof(ItemIdData);
+
 		ItemIdSetUnused(itemid);
 		unused[uncnt++] = toff;
 	}
@@ -1658,6 +1687,9 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	 */
 	END_CRIT_SECTION();
 
+	/* Count tuples marked as unused */
+	vacrelstats->tuples_deleted += uncnt;
+
 	/*
 	 * Now that we have removed the dead tuples from the page, once again
 	 * check if the page has become all-visible.  The page is already marked
@@ -2370,3 +2402,15 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 
 	return all_visible;
 }
+
+/* Return a string of formatted size with size units */
+static char *
+get_pretty_size(Size size)
+{
+	Datum size_text;
+
+	size_text = DirectFunctionCall1(pg_size_pretty,
+									Int64GetDatum(size));
+
+	return text_to_cstring(DatumGetTextP(size_text));
+}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 77e5e60..2c9e15e 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -184,8 +184,9 @@ extern TransactionId heap_compute_xid_horizon_for_tuples(Relation rel,
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int heap_page_prune(Relation relation, Buffer buffer,
 				TransactionId OldestXmin,
-				bool report_stats, TransactionId *latestRemovedXid);
-extern void heap_page_prune_execute(Buffer buffer,
+				bool report_stats, TransactionId *latestRemovedXid,
+				Size *freed_bytes);
+extern Size heap_page_prune_execute(Buffer buffer,
 						OffsetNumber *redirected, int nredirected,
 						OffsetNumber *nowdead, int ndead,
 						OffsetNumber *nowunused, int nunused);
fix_assertion_v2.patchapplication/octet-stream; name=fix_assertion_v2.patchDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9364cd4..9889e48 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1072,10 +1072,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					 *
 					 * If the tuple is HOT-updated then it must only be
 					 * removed by a prune operation; so we keep it just as if
-					 * it were RECENTLY_DEAD.  Also, if it's a heap-only
-					 * tuple, we choose to keep it, because it'll be a lot
-					 * cheaper to get rid of it in the next pruning pass than
-					 * to treat it like an indexed tuple.
+					 * it were RECENTLY_DEAD.  Also, either if it's a heap-only
+					 * tuple or when index cleanup is disabled, we choose to
+					 * keep it, because it'll be a lot cheaper to get rid of
+					 * it in the next pruning pass than to treat it.
 					 *
 					 * If this were to happen for a tuple that actually needed
 					 * to be deleted, we'd be in trouble, because it'd
@@ -1085,20 +1085,11 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 					 * preventing corruption.
 					 */
 					if (HeapTupleIsHotUpdated(&tuple) ||
-						HeapTupleIsHeapOnly(&tuple))
+						HeapTupleIsHeapOnly(&tuple) ||
+						params->index_cleanup == VACOPT_TERNARY_DISABLED)
 						nkeep += 1;
 					else
-					{
 						tupgone = true; /* we can delete the tuple */
-
-						/*
-						 * Since this dead tuple will not be vacuumed and
-						 * ignored when index cleanup is disabled we count
-						 * count it for reporting.
-						 */
-						if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
-							nleft_dead_tuples++;
-					}
 					all_visible = false;
 					break;
 				case HEAPTUPLE_LIVE:
#98Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#88)
Re: New vacuum option to do only freezing

Hi,

This thread is referenced an open item, and we ought to make some
progress on it.

On a more cosmetic note:

On 2019-04-16 09:10:19 -0700, Andres Freund wrote:

On 2019-04-16 12:01:36 -0400, Tom Lane wrote:

(BTW, I don't understand why that code will throw "found xmin %u from
before relfrozenxid %u" if HeapTupleHeaderXminFrozen is true? Shouldn't
the whole if-branch at lines 6113ff be skipped if xmin_frozen?)

I *think* that just looks odd, but isn't actively wrong. That's because
TransactionIdIsNormal() won't trigger, as:

#define HeapTupleHeaderGetXmin(tup) \
( \
HeapTupleHeaderXminFrozen(tup) ? \
FrozenTransactionId : HeapTupleHeaderGetRawXmin(tup) \
)

which afaict makes
xmin_frozen = ((xid == FrozenTransactionId) ||
HeapTupleHeaderXminFrozen(tuple));
redundant.

Looks like that was introduced relatively recently, in

commit d2599ecfcc74fea9fad1720a70210a740c716730
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: 2018-05-04 15:24:44 -0300

Don't mark pages all-visible spuriously

@@ -6814,6 +6815,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,

/* Process xmin */
xid = HeapTupleHeaderGetXmin(tuple);
+   xmin_frozen = ((xid == FrozenTransactionId) ||
+                  HeapTupleHeaderXminFrozen(tuple));
if (TransactionIdIsNormal(xid))
{
if (TransactionIdPrecedes(xid, relfrozenxid))
@@ -6832,9 +6835,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,

frz->t_infomask |= HEAP_XMIN_FROZEN;
changed = true;
+ xmin_frozen = true;
}
- else
- totally_frozen = false;
}

Alvaro, could we perhaps clean this up a bit? This is pretty confusing
looking. I think this probably could just be changed to

bool xmin_frozen = false;

xid = HeapTupleHeaderGetXmin(tuple);

if (xid == FrozenTransactionId)
xmin_frozen = true;
else if (TransactionIdIsNormal(xid))

or somesuch. There's no need to check for
HeapTupleHeaderXminFrozen(tuple) etc, because HeapTupleHeaderGetXmin()
already does so - and if it didn't, the issue Tom points out would be
problematic.

Greetings,

Andres Freund

#99Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#70)
Re: New vacuum option to do only freezing

Hi,

On 2019-04-06 16:13:53 +0900, Michael Paquier wrote:

On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote:

Yes, but Fujii-san pointed out that this option doesn't support toast
tables and I think there is not specific reason why not supporting
them. So it might be good to add toast.vacuum_index_cleanup. Attached
patch.

Being able to control that option at toast level sounds sensible. I
have added an open item about that.

Robert, what is your stance on this open item? It's been an open item
for about three weeks, without any progress.

Greetings,

Andres Freund

#100Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#99)
Re: New vacuum option to do only freezing

On Wed, May 1, 2019 at 12:14 PM Andres Freund <andres@anarazel.de> wrote:

On 2019-04-06 16:13:53 +0900, Michael Paquier wrote:

On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote:

Yes, but Fujii-san pointed out that this option doesn't support toast
tables and I think there is not specific reason why not supporting
them. So it might be good to add toast.vacuum_index_cleanup. Attached
patch.

Being able to control that option at toast level sounds sensible. I
have added an open item about that.

Robert, what is your stance on this open item? It's been an open item
for about three weeks, without any progress.

The actual bug in this patch needs to be fixed, but I see we have
another open item for that. This open item, as I understand it, is
all about whether we should add another reloption so that you can
control this behavior separately for TOAST tables. In my opinion,
that's not a critical change and the open item should be dropped, but
others might see it differently.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#101Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#89)
Re: New vacuum option to do only freezing

On Tue, Apr 16, 2019 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So after thinking about this a bit more ...

ISTM that what we have here is a race condition (ie, tuple changed state
since heap_page_prune), and that ideally we want the code to resolve it
as if no race had happened. That is, either of these behaviors would
be acceptable:

1. Delete the tuple, just as heap_page_prune would've done if it had seen
it DEAD. (Possibly we could implement that by jumping back and doing
heap_page_prune again, but that seems pretty messy and bug-prone.
In any case, if we're not doing index cleanup then this must reduce to
"replace tuple by a dead line pointer", not remove it altogether.)

2. Act as if the tuple were still live, just as would've happened if the
state didn't change till just after we looked instead of just before.

Option #2 is a lot simpler and safer, and can be implemented as I
suggested earlier, assuming we're all good with the assumption that
heap_prepare_freeze_tuple isn't going to do anything bad.

After studying this more carefully, I agree. You and Andres and
Alvaro are all correct, and I'm plain wrong. Thanks for explaining.
I have committed a patch that changes the logic as per your
suggestion, and also removes nleft_dead_tuples and nleft_dead_itemids.

I'll reply separately to your other point about tups_vacuumed reporting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#102Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#101)
Re: New vacuum option to do only freezing

Hi,

On 2019-05-02 10:20:25 -0400, Robert Haas wrote:

On Tue, Apr 16, 2019 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So after thinking about this a bit more ...
2. Act as if the tuple were still live, just as would've happened if the
state didn't change till just after we looked instead of just before.

Option #2 is a lot simpler and safer, and can be implemented as I
suggested earlier, assuming we're all good with the assumption that
heap_prepare_freeze_tuple isn't going to do anything bad.

After studying this more carefully, I agree. You and Andres and
Alvaro are all correct, and I'm plain wrong. Thanks for explaining.
I have committed a patch that changes the logic as per your
suggestion, and also removes nleft_dead_tuples and nleft_dead_itemids.

It'd be good if somebody could make a pass over the safety mechanisms in
heap_prepare_freeze_tuple(). I added them at some point, after a data
corrupting bug related to freezing, but they really were more intended
as a secondary layer of defense, rather than the primary one. My
understanding is still that we assume that we never should reach
heap_prepare_freeze_tuple() for something that is below the horizon,
even after this change, right?

Greetings,

Andres Freund

#103Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#90)
Re: New vacuum option to do only freezing

On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

Here's a couple of ideas about that:

1. Ignore heap_page_prune's activity altogether, on the grounds that
it's just random chance that any cleanup done there was done during
VACUUM and not some preceding DML operation. Make tups_vacuumed
be the count of dead line pointers removed. The advantage of this
way is that tups_vacuumed would become independent of previous
non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
VACUUMs. But maybe it's hiding too much information.

2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
tuples that it deleted entirely. The action of replacing a DEAD
root tuple with a dead line pointer doesn't count for anything.
Then also add the count of dead line pointers removed to tups_vacuumed.

Approach #2 is closer to the way we've defined tups_vacuumed up to
now, but I think it'd be more realistic in cases where previous
pruning or index-cleanup-disabled vacuums have left lots of dead
line pointers.

I'm not especially wedded to either of these --- any other ideas?

I think it's almost impossible to have clear reporting here with only
a single counter. There are two clearly-distinct cleanup operations
going on here: (1) removing tuples from pages, and (2) making dead
line pointers unused so that they can be reused for new tuples. They
happen in equal quantity when there are no HOT updates: pruning makes
dead tuples into dead line pointers, and then index vacuuming allows
the dead line pointers to be set unused. But if there are HOT
updates, intermediate tuples in each HOT chain are removed from the
page but the line pointers are directly set to unused, so VACUUM could
remove a lot of tuples but not need to make very many dead line
pointers unused. On the other hand, the opposite could also happen:
maybe lots of single-page HOT-pruning has happened prior to VACUUM, so
VACUUM has lots of dead line pointers to make unused but removes very
few tuples because that's already been done.

For the most part, tups_vacuumed seems to be intending to count #1
rather than #2. While the comments for heap_page_prune and
heap_prune_chain are not as clear as might be desirable, it appears to
me that those functions are counting tuples removed from a page,
ignoring everything that might happen to line pointers -- so using the
return value of this function is consistent with wanting to count #1.
However, there's one place that seems slightly unclear about this,
namely this comment:

/*
* DEAD item pointers are to be vacuumed normally; but we don't
* count them in tups_vacuumed, else we'd be double-counting (at
* least in the common case where heap_page_prune() just freed up
* a non-HOT tuple).
*/

If we're counting tuples removed from pages, then it's not merely that
we would be double-counting, but that we would be counting completely
the wrong thing. However, as far as I can see, that's just an issue
with the comments; the code is in all cases counting tuple removals,
not dead line pointers marked unused.

If I understand correctly, your first proposal amounts to redefining
tups_vacuumed to count #2 rather than #1, and your second proposal
amounts to making tups_vacuumed count #1 + #2 rather than #1. I
suggest a third option: have two counters. tups_vacuum can continue
to count #1, with just a comment adjustment. And then we can add a
second counter which is incremented every time lazy_vacuum_page does
ItemIdSetUnused, which will count #2.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#104Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#103)
Re: New vacuum option to do only freezing

On 2019-05-02 11:09:10 -0400, Robert Haas wrote:

If I understand correctly, your first proposal amounts to redefining
tups_vacuumed to count #2 rather than #1, and your second proposal
amounts to making tups_vacuumed count #1 + #2 rather than #1. I
suggest a third option: have two counters.

+1

#105Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#102)
Re: New vacuum option to do only freezing

On Thu, May 2, 2019 at 10:28 AM Andres Freund <andres@anarazel.de> wrote:

It'd be good if somebody could make a pass over the safety mechanisms in
heap_prepare_freeze_tuple(). I added them at some point, after a data
corrupting bug related to freezing, but they really were more intended
as a secondary layer of defense, rather than the primary one. My
understanding is still that we assume that we never should reach
heap_prepare_freeze_tuple() for something that is below the horizon,
even after this change, right?

I think so. This code is hit if the tuple wasn't dead yet at the time
that heap_page_prune() decided not to prune it, but it is dead by the
time we retest the tuple status. But the same value of OldestXmin was
used for both checks, so the only way that can really happen is if the
XID in the tuple header was running before and is no longer running.
However, if the XID was running at the time that heap_page_prune()
ran, than OldestXmin certainly can't be newer than that XID. And
therefore the value we're intending to set for relfrozenxid has surely
got to be older, so we could hardly prune using OldestXmin as the
threshold and then relfrozenxid to a newer XID.

Actually, I now believe that my original concern here was exactly
backwards. Prior to the logic change in this commit, with index
vacuum disabled, a tuple that becomes dead between the
heap_page_prune() check and the lazy_scan_heap() check would have
followed the tupgone = true path. That would cause it to be treated
as if the second vacuum pass were going to remove it - i.e. not
frozen. But with index cleanup disabled, there will never be a second
vacuum pass. So a tuple that was actually being kept was not sent to
heap_prepare_freeze_tuple() with, basically, no justification.
Imagine for example that the tuple was not old in terms of its XID
age, but its MXID age was somehow ancient. Then we'd fail to freeze
it on the theory that it was going to be removed, but yet not remove
it. Oops. The revised logic - which is as per Tom's suggestion -
does the right thing, which is treat the tuple as one we've chosen to
keep.

While looking at this code, I think I may have spotted another bug, or
at least a near-bug. lazy_record_dead_tuple() thinks it's OK to just
forget about dead tuples if there's not enough memory, which I think
is OK in the normal case where the dead tuple has been truncated to a
dead line pointer. But in the tupgone = true case it's NOT ok,
because in that case we're leaving behind not just a dead line pointer
but an actual tuple which we have declined to freeze on the assumption
that it will be removed later. But if lazy_record_dead_tuple()
forgets about it, then it won't be removed later. That could lead to
tuples remaining that precede the freeze horizons. The reason why
this may be only a near-bug is that we seem to try pretty hard to make
sure that we'll never call that function in the first place without
enough space being available. Still, it seems to me that it would be
safer to change the code to look like:

if (vacrelstats->num_dead_tuples >= vacrelstats->max_dead_tuples)
elog(ERROR, "oh crap");

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#106Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#98)
1 attachment(s)
Re: New vacuum option to do only freezing

On 2019-May-01, Andres Freund wrote:

Alvaro, could we perhaps clean this up a bit? This is pretty confusing
looking. I think this probably could just be changed to

bool xmin_frozen = false;

xid = HeapTupleHeaderGetXmin(tuple);

if (xid == FrozenTransactionId)
xmin_frozen = true;
else if (TransactionIdIsNormal(xid))

or somesuch. There's no need to check for
HeapTupleHeaderXminFrozen(tuple) etc, because HeapTupleHeaderGetXmin()
already does so - and if it didn't, the issue Tom points out would be
problematic.

Ah, yeah, that's simpler. I would like to introduce a couple of very
minor changes to the proposed style, per the attached.

* don't initialize xmin_frozen at all; rather, only set its value to the
correct one when we have determined what it is. Doing premature
initialization is what led to some of those old bugs, so I prefer not to
do it.

* Handle the BootstrapXid and InvalidXid cases explicitly, by setting
xmin_frozen to true when xmin is not normal. After all, those XID
values do not need any freezing.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-heap_prepare_freeze_tuple-Simplify-coding.patchtext/x-diff; charset=us-asciiDownload
From 1cca0581655614703b020facbbf8d64b442f95be Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Thu, 2 May 2019 11:50:18 -0400
Subject: [PATCH] heap_prepare_freeze_tuple: Simplify coding

---
 src/backend/access/heap/heapam.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a05b6a07ad0..d6ceb8fd4a4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6108,9 +6108,9 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 
 	/* Process xmin */
 	xid = HeapTupleHeaderGetXmin(tuple);
-	xmin_frozen = ((xid == FrozenTransactionId) ||
-				   HeapTupleHeaderXminFrozen(tuple));
-	if (TransactionIdIsNormal(xid))
+	if (!TransactionIdIsNormal(xid))
+		xmin_frozen = true;
+	else
 	{
 		if (TransactionIdPrecedes(xid, relfrozenxid))
 			ereport(ERROR,
@@ -6118,7 +6118,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 					 errmsg_internal("found xmin %u from before relfrozenxid %u",
 									 xid, relfrozenxid)));
 
-		if (TransactionIdPrecedes(xid, cutoff_xid))
+		xmin_frozen = TransactionIdPrecedes(xid, cutoff_xid);
+		if (xmin_frozen)
 		{
 			if (!TransactionIdDidCommit(xid))
 				ereport(ERROR,
@@ -6128,7 +6129,6 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 
 			frz->t_infomask |= HEAP_XMIN_FROZEN;
 			changed = true;
-			xmin_frozen = true;
 		}
 	}
 
-- 
2.17.1

#107Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#106)
Re: New vacuum option to do only freezing

Pushed. I added one comment to explain xmin_frozen also, which
otherwise seemed a bit mysterious. I did not backpatch, though, so
9.6-11 are a bit different, but I'm not sure it's a good idea at this
point, though it should be pretty innocuous.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#108Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#103)
Re: New vacuum option to do only freezing

On Fri, May 3, 2019 at 12:09 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

Here's a couple of ideas about that:

1. Ignore heap_page_prune's activity altogether, on the grounds that
it's just random chance that any cleanup done there was done during
VACUUM and not some preceding DML operation. Make tups_vacuumed
be the count of dead line pointers removed. The advantage of this
way is that tups_vacuumed would become independent of previous
non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
VACUUMs. But maybe it's hiding too much information.

2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
tuples that it deleted entirely. The action of replacing a DEAD
root tuple with a dead line pointer doesn't count for anything.
Then also add the count of dead line pointers removed to tups_vacuumed.

Approach #2 is closer to the way we've defined tups_vacuumed up to
now, but I think it'd be more realistic in cases where previous
pruning or index-cleanup-disabled vacuums have left lots of dead
line pointers.

I'm not especially wedded to either of these --- any other ideas?

I think it's almost impossible to have clear reporting here with only
a single counter. There are two clearly-distinct cleanup operations
going on here: (1) removing tuples from pages, and (2) making dead
line pointers unused so that they can be reused for new tuples. They
happen in equal quantity when there are no HOT updates: pruning makes
dead tuples into dead line pointers, and then index vacuuming allows
the dead line pointers to be set unused. But if there are HOT
updates, intermediate tuples in each HOT chain are removed from the
page but the line pointers are directly set to unused, so VACUUM could
remove a lot of tuples but not need to make very many dead line
pointers unused. On the other hand, the opposite could also happen:
maybe lots of single-page HOT-pruning has happened prior to VACUUM, so
VACUUM has lots of dead line pointers to make unused but removes very
few tuples because that's already been done.

For the most part, tups_vacuumed seems to be intending to count #1
rather than #2. While the comments for heap_page_prune and
heap_prune_chain are not as clear as might be desirable, it appears to
me that those functions are counting tuples removed from a page,
ignoring everything that might happen to line pointers -- so using the
return value of this function is consistent with wanting to count #1.
However, there's one place that seems slightly unclear about this,
namely this comment:

/*
* DEAD item pointers are to be vacuumed normally; but we don't
* count them in tups_vacuumed, else we'd be double-counting (at
* least in the common case where heap_page_prune() just freed up
* a non-HOT tuple).
*/

If we're counting tuples removed from pages, then it's not merely that
we would be double-counting, but that we would be counting completely
the wrong thing. However, as far as I can see, that's just an issue
with the comments; the code is in all cases counting tuple removals,
not dead line pointers marked unused.

If I understand correctly, your first proposal amounts to redefining
tups_vacuumed to count #2 rather than #1, and your second proposal
amounts to making tups_vacuumed count #1 + #2 rather than #1. I
suggest a third option: have two counters. tups_vacuum can continue
to count #1, with just a comment adjustment. And then we can add a
second counter which is incremented every time lazy_vacuum_page does
ItemIdSetUnused, which will count #2.

Thoughts?

I agree to have an another counter. Please note that non-HOT-updated
tuples that became dead after hot pruning (that is 'tupgone' tuples)
are changed to unused directly in lazy_page_vacuum. Therefore we would
need not to increment tups_vacuumed in tupgone case if we increment
the new counter in lazy_page_vacuum.

For the contents of vacuum verbose report, it could be worth to
discuss whether reporting the number of deleted line pointers would be
helpful for users. The reporting the number of line pointers we left
might be more helpful for users.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#109Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#100)
1 attachment(s)
Re: New vacuum option to do only freezing

On Thu, May 2, 2019 at 1:39 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, May 1, 2019 at 12:14 PM Andres Freund <andres@anarazel.de> wrote:

On 2019-04-06 16:13:53 +0900, Michael Paquier wrote:

On Sat, Apr 06, 2019 at 11:31:31AM +0900, Masahiko Sawada wrote:

Yes, but Fujii-san pointed out that this option doesn't support toast
tables and I think there is not specific reason why not supporting
them. So it might be good to add toast.vacuum_index_cleanup. Attached
patch.

Being able to control that option at toast level sounds sensible. I
have added an open item about that.

Robert, what is your stance on this open item? It's been an open item
for about three weeks, without any progress.

The actual bug in this patch needs to be fixed, but I see we have
another open item for that. This open item, as I understand it, is
all about whether we should add another reloption so that you can
control this behavior separately for TOAST tables. In my opinion,
that's not a critical change and the open item should be dropped, but
others might see it differently.

I agree that this item is neither critical and bug. But this is an
(my) oversight and is a small patch and I think there is no specific
reason why we don't dare to include this in 12. So if this patch could
get reviewed enough I think we can have it in 12. Since the previous
patch conflicts with current HEAD I've attached the rebased version
patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

toast_vacuum_index_cleanup_v2.patchapplication/octet-stream; name=toast_vacuum_index_cleanup_v2.patchDownload
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 44a61ef..1e1b0e8 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
     <indexterm>
      <primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
     </indexterm>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cfbabb5..022b3a0 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] =
 		{
 			"vacuum_index_cleanup",
 			"Enables index vacuuming and index cleanup",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
 			ShareUpdateExclusiveLock
 		},
 		true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e4c03de..2379b3d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.autovacuum_vacuum_scale_factor",
 	"toast.autovacuum_vacuum_threshold",
 	"toast.log_autovacuum_min_duration",
+	"toast.vacuum_index_clenaup",
 	"toast.vacuum_truncate",
 	"toast_tuple_target",
 	"user_catalog_table",
#110Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#109)
Re: New vacuum option to do only freezing

Hi,

On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote:

I agree that this item is neither critical and bug. But this is an
(my) oversight and is a small patch and I think there is no specific
reason why we don't dare to include this in 12. So if this patch could
get reviewed enough I think we can have it in 12. Since the previous
patch conflicts with current HEAD I've attached the rebased version
patch.

Robert, this indeed looks near trivial. What do you think?

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 44a61ef..1e1b0e8 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</varlistentry>
<varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
<indexterm>
<primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
</indexterm>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cfbabb5..022b3a0 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] =
{
"vacuum_index_cleanup",
"Enables index vacuuming and index cleanup",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
ShareUpdateExclusiveLock
},
true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e4c03de..2379b3d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = {
"toast.autovacuum_vacuum_scale_factor",
"toast.autovacuum_vacuum_threshold",
"toast.log_autovacuum_min_duration",
+	"toast.vacuum_index_clenaup",
"toast.vacuum_truncate",
"toast_tuple_target",
"user_catalog_table",

typo.

Sawada-san, it'd be good if you could add at least some minimal tests in
the style of the no_index_cleanup test in vacuum.sql.

- Andres

#111Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#110)
Re: New vacuum option to do only freezing

Andres Freund <andres@anarazel.de> writes:

Robert, this indeed looks near trivial. What do you think?

On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote:

+ "toast.vacuum_index_clenaup",

Not trivial enough to not have typos, apparently.

regards, tom lane

#112Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#110)
1 attachment(s)
Re: New vacuum option to do only freezing

On Sat, May 18, 2019 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-05-09 14:14:20 +0900, Masahiko Sawada wrote:

I agree that this item is neither critical and bug. But this is an
(my) oversight and is a small patch and I think there is no specific
reason why we don't dare to include this in 12. So if this patch could
get reviewed enough I think we can have it in 12. Since the previous
patch conflicts with current HEAD I've attached the rebased version
patch.

Robert, this indeed looks near trivial. What do you think?

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 44a61ef..1e1b0e8 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</varlistentry>
<varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
<indexterm>
<primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
</indexterm>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cfbabb5..022b3a0 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] =
{
"vacuum_index_cleanup",
"Enables index vacuuming and index cleanup",
-                     RELOPT_KIND_HEAP,
+                     RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
ShareUpdateExclusiveLock
},
true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e4c03de..2379b3d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = {
"toast.autovacuum_vacuum_scale_factor",
"toast.autovacuum_vacuum_threshold",
"toast.log_autovacuum_min_duration",
+     "toast.vacuum_index_clenaup",
"toast.vacuum_truncate",
"toast_tuple_target",
"user_catalog_table",

typo.

Sawada-san, it'd be good if you could add at least some minimal tests in
the style of the no_index_cleanup test in vacuum.sql.

Thank you for comments. Attached updated version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

toast_vacuum_index_cleanup_v3.patchapplication/x-patch; name=toast_vacuum_index_cleanup_v3.patchDownload
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 44a61ef..1e1b0e8 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
     <indexterm>
      <primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
     </indexterm>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index cfbabb5..022b3a0 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -147,7 +147,7 @@ static relopt_bool boolRelOpts[] =
 		{
 			"vacuum_index_cleanup",
 			"Enables index vacuuming and index cleanup",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
 			ShareUpdateExclusiveLock
 		},
 		true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e4c03de..e2aa738 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1056,6 +1056,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.autovacuum_vacuum_scale_factor",
 	"toast.autovacuum_vacuum_threshold",
 	"toast.log_autovacuum_min_duration",
+	"toast.vacuum_index_cleanup",
 	"toast.vacuum_truncate",
 	"toast_tuple_target",
 	"user_catalog_table",
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 8353d84..bc43e82 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -81,7 +81,10 @@ SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 -- INDEX_CLEANUP option
-CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT) WITH (vacuum_index_cleanup = false, toast.vacuum_index_cleanup = false);
+INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000));
+SELECT relname as toastrel FROM pg_class WHERE oid = (SELECT reltoastrelid FROM pg_class WHERE relname = 'no_index_cleanup') \gset
+VACUUM pg_toast.:toastrel; -- vacuum toast table of no_index_cleanup with no index cleanup
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
 VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
 VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index a558580..5173539 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -63,7 +63,10 @@ VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
 -- INDEX_CLEANUP option
-CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT) WITH (vacuum_index_cleanup = false, toast.vacuum_index_cleanup = false);
+INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000));
+SELECT relname as toastrel FROM pg_class WHERE oid = (SELECT reltoastrelid FROM pg_class WHERE relname = 'no_index_cleanup') \gset
+VACUUM pg_toast.:toastrel; -- vacuum toast table of no_index_cleanup with no index cleanup
 VACUUM (INDEX_CLEANUP FALSE) vaccluster;
 VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
 VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
#113Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#110)
Re: New vacuum option to do only freezing

On Fri, May 17, 2019 at 4:55 PM Andres Freund <andres@anarazel.de> wrote:

Robert, this indeed looks near trivial. What do you think?

Hmm, yeah, I guess that'd be OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#114Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#112)
Re: New vacuum option to do only freezing

On Mon, May 20, 2019 at 11:43:19AM +0900, Masahiko Sawada wrote:

Thank you for comments. Attached updated version patch.

This is an open item present for quite some time, so I have looked at
the patch. The base patch is fine.

+INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000));
Do we really need a string as long as that?

Is actually the existing set of tests that helpful? We now have only
two VACUUM queries which run on no_index_cleanup, both of them using
FULL so the reloption as well as the value of INDEX_CLEANUP are
ignored. Wouldn't it be better to redesign a bit the tests with more
combinations of options like that?
-- Toast inherit the value from the table
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false);
-- Value directly defined for both
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
toast.vacuum_index_cleanup = true);
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
toast.vacuum_index_cleanup = false);

It seems to me that we'd want tests to make sure that indexes are
actually cleaned up, where pageinspect could prove to be useful.
--
Michael

In reply to: Michael Paquier (#114)
Re: New vacuum option to do only freezing

On Tue, Jun 18, 2019 at 10:39 PM Michael Paquier <michael@paquier.xyz> wrote:

+INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000));
Do we really need a string as long as that?

Specifying EXTERNAL storage might make things easier. I have used
PLAIN storage to test the 1/3 of a page restriction within nbtree, and
to test a bug in amcheck that was related to TOAST compression.

It seems to me that we'd want tests to make sure that indexes are
actually cleaned up, where pageinspect could prove to be useful.

That definitely seems preferable, but it'll be a bit tricky to do it
in a way that doesn't run into buildfarm issues due to alignment. I
suggest an index on a text column to avoid problems.

--
Peter Geoghegan

#116Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#115)
1 attachment(s)
Re: New vacuum option to do only freezing

On Wed, Jun 19, 2019 at 12:51:41PM -0700, Peter Geoghegan wrote:

On Tue, Jun 18, 2019 at 10:39 PM Michael Paquier <michael@paquier.xyz> wrote:

+INSERT INTO no_index_cleanup(i, t) VALUES(1, repeat('1234567890',30000));
Do we really need a string as long as that?

Specifying EXTERNAL storage might make things easier. I have used
PLAIN storage to test the 1/3 of a page restriction within nbtree, and
to test a bug in amcheck that was related to TOAST compression.

Ah, good point here. That makes sense.

It seems to me that we'd want tests to make sure that indexes are
actually cleaned up, where pageinspect could prove to be useful.

That definitely seems preferable, but it'll be a bit tricky to do it
in a way that doesn't run into buildfarm issues due to alignment. I
suggest an index on a text column to avoid problems.

I am not completely sure how tricky that may be, so I'll believe you
on this one :)

So, to keep things simple and if we want to get something into v12, I
would suggest to just stress more combinations even if the changes are
not entirely visible yet. If we get a couple of queries to run with
the option disabled on the table, its toast or both by truncating and
filling in the table in-between, we may be able to catch some issues
by stressing those code paths.

And I finish with the attached. Thoughts?
--
Michael

Attachments:

toast_vacuum_index_cleanup_v4.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 8e1bfe4d78..9009addb9c 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1406,7 +1406,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
-    <term><literal>vacuum_index_cleanup</literal> (<type>boolean</type>)
+    <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
     <indexterm>
      <primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
     </indexterm>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index de06c92574..5773021499 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -144,7 +144,7 @@ static relopt_bool boolRelOpts[] =
 		{
 			"vacuum_index_cleanup",
 			"Enables index vacuuming and index cleanup",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
 			ShareUpdateExclusiveLock
 		},
 		true
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e38f46399..7dcf342413 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1069,6 +1069,7 @@ static const char *const table_storage_parameters[] = {
 	"toast.autovacuum_vacuum_scale_factor",
 	"toast.autovacuum_vacuum_threshold",
 	"toast.log_autovacuum_min_duration",
+	"toast.vacuum_index_cleanup",
 	"toast.vacuum_truncate",
 	"toast_tuple_target",
 	"user_catalog_table",
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index f944b93fd6..5d50548f65 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -93,13 +93,38 @@ SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 -- INDEX_CLEANUP option
-CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
-VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
-VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
+-- Use uncompressed data stored in toast.
+CREATE INDEX no_index_cleanup_idx ON no_index_cleanup(t);
+ALTER TABLE no_index_cleanup ALTER COLUMN t SET STORAGE EXTERNAL;
+INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(1,30),
+    repeat('1234567890',300));
 -- index cleanup option is ignored if VACUUM FULL
 VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
 VACUUM (FULL TRUE) no_index_cleanup;
+-- Toast inherits the value from its parent table.
+DELETE FROM no_index_cleanup WHERE i < 15;
+-- Nothing is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false);
+-- Both parent relation and toast are cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
+VACUUM no_index_cleanup;
+-- Parameter is set for both the parent table and its toast relation.
+INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
+    repeat('1234567890',300));
+DELETE FROM no_index_cleanup WHERE i < 45;
+-- Only toast index is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
+    toast.vacuum_index_cleanup = true);
+VACUUM no_index_cleanup;
+-- Only parent is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
+    toast.vacuum_index_cleanup = false);
+VACUUM no_index_cleanup;
+-- Test some extra relations.
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
 -- TRUNCATE option
 CREATE TABLE vac_truncate_test(i INT NOT NULL, j text)
 	WITH (vacuum_truncate=true, autovacuum_enabled=false);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 16748e1823..91a2ea6482 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -76,13 +76,38 @@ VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
 -- INDEX_CLEANUP option
-CREATE TABLE no_index_cleanup (i INT PRIMARY KEY) WITH (vacuum_index_cleanup = false);
-VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
-VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
+CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
+-- Use uncompressed data stored in toast.
+CREATE INDEX no_index_cleanup_idx ON no_index_cleanup(t);
+ALTER TABLE no_index_cleanup ALTER COLUMN t SET STORAGE EXTERNAL;
+INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(1,30),
+    repeat('1234567890',300));
 -- index cleanup option is ignored if VACUUM FULL
 VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;
 VACUUM (FULL TRUE) no_index_cleanup;
+-- Toast inherits the value from its parent table.
+DELETE FROM no_index_cleanup WHERE i < 15;
+-- Nothing is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false);
+-- Both parent relation and toast are cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
+VACUUM no_index_cleanup;
+-- Parameter is set for both the parent table and its toast relation.
+INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
+    repeat('1234567890',300));
+DELETE FROM no_index_cleanup WHERE i < 45;
+-- Only toast index is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
+    toast.vacuum_index_cleanup = true);
+VACUUM no_index_cleanup;
+-- Only parent is cleaned up.
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
+    toast.vacuum_index_cleanup = false);
+VACUUM no_index_cleanup;
+-- Test some extra relations.
+VACUUM (INDEX_CLEANUP FALSE) vaccluster;
+VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
 
 -- TRUNCATE option
 CREATE TABLE vac_truncate_test(i INT NOT NULL, j text)
#117Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#116)
Re: New vacuum option to do only freezing

On Thu, Jun 20, 2019 at 03:50:32PM +0900, Michael Paquier wrote:

And I finish with the attached. Thoughts?

So, are there any objections with this patch? Or do people think that
it's too late for v12 and that it is better to wait until v13 opens
for business?
--
Michael

#118Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#117)
Re: New vacuum option to do only freezing

On Sun, Jun 23, 2019 at 10:29:25PM +0900, Michael Paquier wrote:

So, are there any objections with this patch? Or do people think that
it's too late for v12 and that it is better to wait until v13 opens
for business?

Committed, and open item closed.
--
Michael