Teaching users how they can get the most out of HOT in Postgres 14
Recent work from commit 5100010e taught VACUUM that it doesn't have to
do index vacuuming in cases where there are practically zero (not
necessarily exactly zero) tuples to delete from indexes. It also
surfaces the information used to decide whether or not we skip index
vacuuming in the logs, via the log_autovacuum_min_duration mechanism.
This log output can be used to get a sense of how effective HOT is
over time.
There is one number of particular interest: the proportion of heap
pages that have one or more LP_DEAD items across successive VACUUMs
(this is expressed as a percentage of the table). The immediate reason
to expose this is that it is crucial to the skipping behavior from
commit 5100010e -- the threshold for skipping is 2% of all heap pages.
But that's certainly not the only reason to pay attention to the
percentage. It can also be used to understand HOT in general. It can
be correlated with workload spikes and stressors that tend to make HOT
less effective.
A number of interesting workload-specific patterns seem to emerge by
focussing on how this number changes/grows over time. I think that
this should be pointed out directly in the docs. What's more, it seems
like a good vehicle for discussing how HOT works in general. Why did
we never really get around to documenting HOT? There should at least
be some handling of how DBAs can get the most out of HOT through
monitoring and through tuning -- especially by lowering heap
fillfactor.
It's very hard to get all UPDATEs to use HOT. It's much easier to get
UPDATEs to mostly use HOT most of the time. How things change over
time seems crucially important.
I'll show one realistic example, just to give some idea of what it
might look like. This is output for 3 successive autovacuums against
the largest TPC-C table:
automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 4668405 remain, 0 skipped due to pins, 696997 skipped frozen
tuples: 324571 removed, 186702373 remain, 333888 are dead but not yet
removable, oldest xmin: 7624965
buffer usage: 3969937 hits, 3931997 misses, 1883255 dirtied
index scan bypassed: 42634 pages from table (0.91% of total) have
324364 dead item identifiers
avg read rate: 62.469 MB/s, avg write rate: 29.920 MB/s
I/O Timings: read=42359.501 write=11867.903
system usage: CPU: user: 43.62 s, system: 38.17 s, elapsed: 491.74 s
WAL usage: 4586766 records, 1850599 full page images, 8499388881 bytes
automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 5976391 remain, 0 skipped due to pins, 2516643 skipped frozen
tuples: 759956 removed, 239787517 remain, 1848431 are dead but not yet
removable, oldest xmin: 18489326
buffer usage: 3432019 hits, 3385757 misses, 2426571 dirtied
index scan bypassed: 103941 pages from table (1.74% of total) have
790233 dead item identifiers
avg read rate: 50.338 MB/s, avg write rate: 36.077 MB/s
I/O Timings: read=49252.721 write=17003.987
system usage: CPU: user: 45.86 s, system: 34.47 s, elapsed: 525.47 s
WAL usage: 5598040 records, 2274556 full page images, 10510281959 bytes
automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 1
pages: 0 removed, 7483804 remain, 1 skipped due to pins, 4208295 skipped frozen
tuples: 972778 removed, 299295048 remain, 1970910 are dead but not yet
removable, oldest xmin: 30987445
buffer usage: 3384994 hits, 4593727 misses, 2891003 dirtied
index scan needed: 174243 pages from table (2.33% of total) had
1325752 dead item identifiers removed
index "bmsql_order_line_pkey": pages: 1250660 in total, 0 newly
deleted, 0 currently deleted, 0 reusable
avg read rate: 60.505 MB/s, avg write rate: 38.078 MB/s
I/O Timings: read=72881.986 write=21872.615
system usage: CPU: user: 65.24 s, system: 42.24 s, elapsed: 593.14 s
WAL usage: 6668353 records, 2684040 full page images, 12374536939 bytes
These autovacuums occur every 60-90 minutes with the workload in
question (with pretty aggressive autovacuum settings). We see that HOT
works rather well here -- but not so well that index vacuuming can be
avoided consistently, which happens in the final autovacuum (it has
"index scans: 1"). There was slow but steady growth in the percentage
of LP_DEAD-containing heap pages over time here, which is common
enough.
The point of HOT is not to avoid having to do index vacuuming, of
course -- that has it backwards. But framing HOT as doing work in
backends so autovacuum doesn't have to do similar work later on is a
good mental model to encourage users to adopt. There are also
significant advantages to reducing the effectiveness of HOT to this
one number -- HOT must be working well if it's close to 0%, almost
always below 2%, with the occasional aberration that sees it go up to
maybe 5%. But if it ever goes too high (in the absence of DELETEs),
you might have trouble on your hands. It might not go down again.
There are other interesting patterns from other tables within the same
database -- including on tables with no UPDATEs, and tables that
generally cannot use HOT due to a need to modify indexed columns. The
particulars with these other tables hint at problems with heap
fragmentation, which is something that users can think of as a
degenerative process -- something that gets progressively worse in
extreme cases (i.e. cases where it matters).
This new percentage metric isn't about HOT per se. It's actually about
the broader question of how effective the system is at keeping the
physical location of each logical row stable over time, for a given
workload. So maybe that's what any new documentation should address.
The documentation would still have plenty to say about HOT, though. It
would also have something to say about bottom-up index deletion, which
can be thought of as avoiding problems when HOT doesn't or can't be
applied very often.
--
Peter Geoghegan
Hi,
On 2021-04-12 16:11:59 -0700, Peter Geoghegan wrote:
Recent work from commit 5100010e taught VACUUM that it doesn't have to
do index vacuuming in cases where there are practically zero (not
necessarily exactly zero) tuples to delete from indexes.
FWIW, I'd not at all be surprised if this causes some issues. Consider
cases where all index lookups are via bitmap scans (which does not
support killtuples) - if value ranges are looked up often the repeated
heap fetches can absolutely kill query performance. I've definitely had
to make autovacuum more aggressive for cases like this or schedule
manual vacuums, and now that's silently not good enough anymore. Yes, 2%
of the table isn't all that much, but often enough all the updates and
lookups concentrate in one value range.
As far as I can see there's no reasonable way to disable this
"optimization", which scares me.
Greetings,
Andres Freund
On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
As far as I can see there's no reasonable way to disable this
"optimization", which scares me.
I'm fine with adding a simple 'off' switch. What I'd like to avoid
doing is making the behavior tunable, since it's likely to change in
Postgres 15 and Postgres 16 anyway.
--
Peter Geoghegan
On Mon, Apr 12, 2021 at 04:35:13PM -0700, Peter Geoghegan wrote:
On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
As far as I can see there's no reasonable way to disable this
"optimization", which scares me.I'm fine with adding a simple 'off' switch. What I'd like to avoid
doing is making the behavior tunable, since it's likely to change in
Postgres 15 and Postgres 16 anyway.
While going through this commit a couple of days ago, I really got to
wonder why you are controlling this stuff with a hardcoded value and I
found that scary, while what you should be using are two GUCs with the
reloptions that come with the feature (?):
- A threshold, as an integer, to define a number of pages.
- A scale factor to define a percentage of pages.
Also, I am a bit confused with the choice of BYPASS_THRESHOLD_PAGES as
parameter name. For all the other parameters of autovacuum, we use
"threshold" for a fixed number of items, not a percentage of a given
item.
--
Michael
On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
While going through this commit a couple of days ago, I really got to
wonder why you are controlling this stuff with a hardcoded value and I
found that scary, while what you should be using are two GUCs with the
reloptions that come with the feature (?):
- A threshold, as an integer, to define a number of pages.
- A scale factor to define a percentage of pages.
Why?
--
Peter Geoghegan
Hi,
On 2021-04-12 16:53:47 -0700, Peter Geoghegan wrote:
On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
While going through this commit a couple of days ago, I really got to
wonder why you are controlling this stuff with a hardcoded value and I
found that scary, while what you should be using are two GUCs with the
reloptions that come with the feature (?):
- A threshold, as an integer, to define a number of pages.
- A scale factor to define a percentage of pages.Why?
Well, one argument is that you made a fairly significant behavioural
change, with hard-coded logic for when the optimization kicks in. It's
not at all clear that your constants are the right ones for every
workload. We'll likely on get to know whether they're right in > 1 year
- not having a real out at that point imo is somewhat scary.
That said, adding more and more reloptions has a significant cost, so I
don't think it's clear cut that it's the right decision to add
one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for
BYPASS_THRESHOLD_PAGES?
Greetings,
Andres Freund
On Mon, Apr 12, 2021 at 5:37 PM Andres Freund <andres@anarazel.de> wrote:
Well, one argument is that you made a fairly significant behavioural
change, with hard-coded logic for when the optimization kicks in. It's
not at all clear that your constants are the right ones for every
workload.
(Apparently nobody wants to talk about HOT and the documentation.)
The BYPASS_THRESHOLD_PAGES cutoff was chosen conservatively, so that
it would avoid index vacuuming in truly marginal cases -- and it tends
to only delay it there.
A table-level threshold is not the best way of constraining the
problem. In the future, the table threshold should be treated as only
one factor among several. Plus there will be more than a simple yes/no
question to consider. We should eventually be able to do index
vacuuming for some indexes but not others. Bottom-up index deletion
has totally changed things here, because roughly speaking it makes
index bloat proportionate to the number of logical changes to indexed
columns -- you could have one super-bloated index on the table, but
several others that perfectly retain their original size. You still
need to do heap vacuuming eventually, which necessitates vacuuming
indexes too, but the right strategy is probably to vacuum much more
frequently, vacuuming the bloated index each time. You only do a full
round of index vacuuming when the table starts to accumulate way too
many LP_DEAD items. You need a much more sophisticated model for this.
It might also need to hook into autovacuums scheduling.
One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that
it'll work well for some indexes but not others. To a dramatic degree,
even.
That said, nbtree isn't the only index AM, and it is hard to be
completely sure that you've caught everything. So an off switch seems
like a good idea now.
We'll likely on get to know whether they're right in > 1 year
- not having a real out at that point imo is somewhat scary.That said, adding more and more reloptions has a significant cost, so I
don't think it's clear cut that it's the right decision to add
one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for
BYPASS_THRESHOLD_PAGES?
I think that the right way to do this is to generalize INDEX_CLEANUP
to support a mode of operation that disallows vacuumlazy.c from
applying this optimization, as well as any similar optimizations which
will be added in the future.
Even if you don't buy my argument about directly parameterizing
BYPASS_THRESHOLD_PAGES undermining future work, allowing it to be set
much higher than 5% - 10% would be a pretty big footgun. It might
appear to help at first, but risks destabilizing things much later on.
--
Peter Geoghegan
On Mon, Apr 12, 2021 at 06:12:18PM -0700, Peter Geoghegan wrote:
One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that
it'll work well for some indexes but not others. To a dramatic degree,
even.That said, nbtree isn't the only index AM, and it is hard to be
completely sure that you've caught everything. So an off switch seems
like a good idea now.
Whatever the solution chosen, the thing I can see we agree on here is
that we need to do something, at least in the shape of an on/off
switch to have an escape path in case of problems. Peter, could we
get something by beta1 for that? FWIW, I would use a float GUC to
control that, and not a boolean switch, but I am just one voice here,
and that's not a feature I worked on.
--
Michael
On Tue, May 11, 2021 at 04:42:27PM +0900, Michael Paquier wrote:
Whatever the solution chosen, the thing I can see we agree on here is
that we need to do something, at least in the shape of an on/off
switch to have an escape path in case of problems. Peter, could we
get something by beta1 for that? FWIW, I would use a float GUC to
control that, and not a boolean switch, but I am just one voice here,
and that's not a feature I worked on.
So, I have been thinking more about this item, and a boolean switch
still sounded weird to me, so attached is a patch to have two GUCs,
one for manual VACUUM and autovacuum like any other parameters, to
control this behavior, with a default set at 2% of the number of
relation pages with dead items needed to do the index cleanup work.
Even if we switch the parameter type used here, the easiest and most
consistent way to pass down the parameter is just to use VacuumParams
set within ExecVacuum() and the autovacuum code path. The docs need
more work, I guess.
Thoughts?
--
Michael
Attachments:
vacuum-index-cleanup-v1.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cb27257bb6..9c3b9c5e1f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -221,6 +221,10 @@ typedef struct VacuumParams
VacOptTernaryValue truncate; /* Truncate empty pages at the end,
* default value depends on reloptions */
+ double cleanup_index_scale_factor; /* Minimum fraction of relation pages
+ * with dead items required for index
+ * cleanup */
+
/*
* The number of parallel vacuum workers. 0 by default which means choose
* based on the number of indexes. -1 indicates parallel vacuum is
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 95202d37af..9356b9c5b2 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -256,6 +256,8 @@ extern int VacuumCostPageDirty;
extern int VacuumCostLimit;
extern double VacuumCostDelay;
+extern double VacuumCleanupIndexScale;
+
extern int64 VacuumPageHit;
extern int64 VacuumPageMiss;
extern int64 VacuumPageDirty;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index aacdd0f575..24cfbdad15 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -41,6 +41,7 @@ extern int autovacuum_freeze_max_age;
extern int autovacuum_multixact_freeze_max_age;
extern double autovacuum_vac_cost_delay;
extern int autovacuum_vac_cost_limit;
+extern double autovacuum_cleanup_index_scale;
/* autovacuum launcher PID, only valid when worker is shutting down */
extern int AutovacuumLauncherPid;
diff --git a/src/include/utils/guc_tables.h b/src/include/utils/guc_tables.h
index 6b40f1eeb8..30e69a9640 100644
--- a/src/include/utils/guc_tables.h
+++ b/src/include/utils/guc_tables.h
@@ -84,6 +84,7 @@ enum config_group
STATS_MONITORING,
STATS_COLLECTOR,
AUTOVACUUM,
+ VACUUM_INDEX,
CLIENT_CONN_STATEMENT,
CLIENT_CONN_LOCALE,
CLIENT_CONN_PRELOAD,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 17519a970f..8871fb8758 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -103,12 +103,6 @@
#define VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL 50 /* ms */
#define VACUUM_TRUNCATE_LOCK_TIMEOUT 5000 /* ms */
-/*
- * Threshold that controls whether we bypass index vacuuming and heap
- * vacuuming as an optimization
- */
-#define BYPASS_THRESHOLD_PAGES 0.02 /* i.e. 2% of rel_pages */
-
/*
* When a table is small (i.e. smaller than this), save cycles by avoiding
* repeated failsafe checks
@@ -407,7 +401,8 @@ static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
GlobalVisState *vistest,
LVPagePruneState *prunestate);
-static void lazy_vacuum(LVRelState *vacrel, bool onecall);
+static void lazy_vacuum(LVRelState *vacrel, VacuumParams *params,
+ bool onecall);
static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
static void lazy_vacuum_heap_rel(LVRelState *vacrel);
static int lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
@@ -1149,7 +1144,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
}
/* Remove the collected garbage tuples from table and indexes */
- lazy_vacuum(vacrel, false);
+ lazy_vacuum(vacrel, params, false);
have_vacuumed_indexes = true;
/*
@@ -1567,7 +1562,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
/* If any tuples need to be deleted, perform final vacuum cycle */
if (dead_tuples->num_tuples > 0)
- lazy_vacuum(vacrel, !have_vacuumed_indexes);
+ lazy_vacuum(vacrel, params, !have_vacuumed_indexes);
/*
* Vacuum the remainder of the Free Space Map. We must do this whether or
@@ -2052,7 +2047,7 @@ retry:
* wraparound.
*/
static void
-lazy_vacuum(LVRelState *vacrel, bool onecall)
+lazy_vacuum(LVRelState *vacrel, VacuumParams *params, bool onecall)
{
bool do_bypass_optimization;
@@ -2119,7 +2114,7 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* The discrepancy should be negligible. If this optimization is ever
* expanded to cover more cases then this may need to be reconsidered.
*/
- threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
+ threshold = (double) vacrel->rel_pages * params->cleanup_index_scale_factor;
do_bypass_optimization =
(vacrel->lpdead_item_pages < threshold &&
vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d549d0d86f..55c458da07 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -250,6 +250,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
/* user-invoked vacuum never uses this parameter */
params.log_min_duration = -1;
+ /* set scale factor for index cleanup based on GUC */
+ params.cleanup_index_scale_factor = VacuumCleanupIndexScale;
+
/* Now go through the common routine */
vacuum(vacstmt->rels, ¶ms, NULL, isTopLevel);
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d516df0ac5..df9d52a2d2 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -124,6 +124,7 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+double autovacuum_cleanup_index_scale = 0.02;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -2922,6 +2923,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int vac_cost_limit;
double vac_cost_delay;
int log_min_duration;
+ int cleanup_index_scale_factor;
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2968,6 +2970,9 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
? avopts->multixact_freeze_table_age
: default_multixact_freeze_table_age;
+ cleanup_index_scale_factor = autovacuum_cleanup_index_scale >= 0 ?
+ autovacuum_cleanup_index_scale : VacuumCostDelay;
+
tab = palloc(sizeof(autovac_table));
tab->at_relid = relid;
tab->at_sharedrel = classForm->relisshared;
@@ -2986,6 +2991,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
tab->at_params.is_wraparound = wraparound;
tab->at_params.log_min_duration = log_min_duration;
+ tab->at_params.cleanup_index_scale_factor = cleanup_index_scale_factor;
tab->at_vacuum_cost_limit = vac_cost_limit;
tab->at_vacuum_cost_delay = vac_cost_delay;
tab->at_relname = NULL;
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 381d9e548d..953fb2ae08 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,11 @@ int VacuumCostPageMiss = 2;
int VacuumCostPageDirty = 20;
int VacuumCostLimit = 200;
double VacuumCostDelay = 0;
+/*
+ * Fraction number that controls whether we bypass index vacuuming and heap
+ * vacuuming as an optimization.
+ */
+double VacuumCleanupIndexScale = 0.02; /* i.e. 2% of rel_pages */
int64 VacuumPageHit = 0;
int64 VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index eb7f7181e4..3999bbeddb 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -759,6 +759,8 @@ const char *const config_group_names[] =
gettext_noop("Statistics / Query and Index Statistics Collector"),
/* AUTOVACUUM */
gettext_noop("Autovacuum"),
+ /* VACUUM_INDEX */
+ gettext_noop("Vacuum / Index"),
/* CLIENT_CONN_STATEMENT */
gettext_noop("Client Connection Defaults / Statement Behavior"),
/* CLIENT_CONN_LOCALE */
@@ -3739,6 +3741,26 @@ static struct config_real ConfigureNamesReal[] =
NULL, NULL, NULL
},
+ {
+ {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, VACUUM_INDEX,
+ gettext_noop("Fraction of relation pages, with at least one dead item, required to clean up indexes."),
+ NULL
+ },
+ &VacuumCleanupIndexScale,
+ 0.02, 0.0, 0.05,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovacuum_cleanup_index_scale_factor", PGC_SIGHUP, AUTOVACUUM,
+ gettext_noop("Fraction of relation pages, with at least one dead item, required to clean up indexes, for autovacuum."),
+ NULL
+ },
+ &autovacuum_cleanup_index_scale,
+ -1, -1, 0.05,
+ NULL, NULL, NULL
+ },
+
{
{"autovacuum_vacuum_scale_factor", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Number of tuple updates or deletes prior to vacuum as a fraction of reltuples."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index efde01ee56..d9b8d1ad42 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -173,6 +173,13 @@
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
+# - Vacuum / Index
+
+#vacuum_cleanup_index_scale_factor = 0.02 # Minimum fraction of relation
+ # pages with at least one dead item
+ # needed to trigger index cleanup during
+ # manual VACUUM. Default is 2%.
+
# - Background Writer -
#bgwriter_delay = 200ms # 10-10000ms between rounds
@@ -642,6 +649,11 @@
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
+#autovacuum_cleanup_index_scale_factor = -1 # Minimum fraction of relation
+ # pages with at least one dead item
+ # needed to trigger index cleanup,
+ # -1 means use
+ # vacuum_cleanup_index_scale_factor
#------------------------------------------------------------------------------
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 45bd1f1b7e..ac77727564 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2241,6 +2241,38 @@ include_dir 'conf.d'
</note>
</sect2>
+ <sect2 id="runtime-config-vacuum-index">
+ <title>Vacuum parameters for Indexes</title>
+ <para>
+ During the execution of <xref linkend="sql-vacuum"/>
+ and <xref linkend="sql-analyze"/>
+ commands, this controls the execution of vacuum operations
+ performed on indexes of the relation worked on.
+ </para>
+
+ <variablelist>
+ <varlistentry id="guc-vacuum-cleanup-index-scale-factor" xreflabel="vacuum_cleanup_index_scale_factor">
+ <term><varname>vacuum_cleanup_index_scale_factor</varname> (<type>floating point</type>)
+ <indexterm>
+ <primary><varname>vacuum_cleanup_index_scale_factor</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the minimum percentage of relation pages required to have
+ at least one dead item to trigger index cleanup in
+ <command>VACUUM</command> operations. The default is
+ <literal>0.02</literal>, meaning that at least 2% of the relation
+ pages need to have at least one dead item to trigger the cleanup
+ This parameter can only be set in the <filename>postgresql.conf</filename>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </sect2>
+
<sect2 id="runtime-config-resource-background-writer">
<title>Background Writer</title>
@@ -8001,6 +8033,26 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum--cleanup-index-scale-factor" xreflabel="autovacuum_cleanup_index_scale_factor">
+ <term><varname>autovacuum_cleanup_index_scale_factor</varname> (<type>floating point</type>)
+ <indexterm>
+ <primary><varname>autovacuum_cleanup_index_scale_factor</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the minimum percentage of relation pages required to have
+ at least one dead item to trigger index cleanup in automatic
+ <command>VACUUM</command> operations. If <literal>-1</literal> is
+ specified (which is the default), the regular
+ <xref linkend="guc-vacuum-cleanup-index-scale-factor"/> value will be used.
+ This parameter can only be set in the <filename>postgresql.conf</filename>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
On Thu, May 13, 2021 at 04:27:47PM +0900, Michael Paquier wrote:
On Tue, May 11, 2021 at 04:42:27PM +0900, Michael Paquier wrote:
Whatever the solution chosen, the thing I can see we agree on here is
that we need to do something, at least in the shape of an on/off
switch to have an escape path in case of problems. Peter, could we
get something by beta1 for that? FWIW, I would use a float GUC to
control that, and not a boolean switch, but I am just one voice here,
and that's not a feature I worked on.So, I have been thinking more about this item, and a boolean switch
still sounded weird to me, so attached is a patch to have two GUCs,
one for manual VACUUM and autovacuum like any other parameters, to
control this behavior, with a default set at 2% of the number of
relation pages with dead items needed to do the index cleanup work.Even if we switch the parameter type used here, the easiest and most
consistent way to pass down the parameter is just to use VacuumParams
set within ExecVacuum() and the autovacuum code path. The docs need
more work, I guess.Thoughts?
+ cleanup_index_scale_factor = autovacuum_cleanup_index_scale >= 0 ? + autovacuum_cleanup_index_scale : VacuumCostDelay;
CostDelay is surely not what you meant.
+ <title>Vacuum parameters for Indexes</title> + <para> + During the execution of <xref linkend="sql-vacuum"/> + and <xref linkend="sql-analyze"/>
"and analyze" is wrong?
+ This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line.
It's SIGHUP
+ This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line.
Same
+ {
+ {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, VACUUM_INDEX,
+ gettext_noop("Fraction of relation pages, with at least one dead item, required to clean up indexes."),
+ NULL
+ },
+ &VacuumCleanupIndexScale,
+ 0.02, 0.0, 0.05,
+ NULL, NULL, NULL
+ },
Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0 ?
The old GUC of the same name had max 1e10.
I think a reduced range and a redefinition of the GUC would need to be called
out as an incompatibility.
Also, the old GUC (removed at 9f3665fbf) had:
- {"vacuum_cleanup_index_scale_factor", PGC_USERSET, CLIENT_CONN_STATEMENT,
I think USERSET and STATEMENT were right ?
Alternately, what if this were in the DEVELOPER category, which makes this
easier to remove in v15.
--
Justin
On Thu, May 13, 2021 at 5:06 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0 ?
The old GUC of the same name had max 1e10.
It also had a completely different purpose and default.
I think a reduced range and a redefinition of the GUC would need to be called
out as an incompatibility.
The justification from Michael for this approach is that not having
this level of control would be weird, at least to him. But that
justification itself seems weird to me; why start from the premise
that you need a knob (as opposed to an off switch) at all? Why not
start with the way the mechanism works (or is intended to work) in
practice? Most individual tables will *never* have VACUUM apply the
optimization with *any* reasonable threshold value, so we only need to
consider the subset of tables/workloads where it *might* make sense to
skip index vacuuming. This is more qualitative than quantitative.
It makes zero sense to treat the threshold as a universal scale --
this is one reason why I don't want to expose a true tunable knob to
users. Though the threshold-driven/BYPASS_THRESHOLD_PAGES design is
not exactly something with stable behavior for a given table, it
almost works like that in practice: tables tend to usually skip index
vacuuming, or never skip it even once. There is a clear bifurcation
along this line when you view how VACUUM behaves with a variety of
different tables using the new autovacuum logging stuff.
Almost all of the benefit of the optimization is available with the
current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
LP_DEAD items), which has less risk than a higher threshold. I don't
think it matters much if we have the occasional unnecessary round of
index vacuuming on account of not applying the optimization. The truly
important benefit of the optimization is to not do unnecessary index
vacuuming all the time in the extreme cases where it's really hard to
justify. There is currently zero evidence that anything higher than 2%
will ever help anybody to an appreciably degree. (There is also no
evidence that the optimization will ever need to be disabled, but I
accept the need to be conservative and offer an off switch -- the
precautionary principle applies when talking about new harms.)
Not having to scan every index on every VACUUM, but only every 5th or
so VACUUM is a huge improvement. But going from every 5th VACUUM to
every 10th VACUUM? That's at best a tiny additional improvement in
exchange for what I'd guess is a roughly linear increase in risk
(maybe a greater-than-linear increase, even). That's an awful deal.
--
Peter Geoghegan
On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
Almost all of the benefit of the optimization is available with the
current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
LP_DEAD items), which has less risk than a higher threshold. I don't
think it matters much if we have the occasional unnecessary round of
index vacuuming on account of not applying the optimization. The truly
important benefit of the optimization is to not do unnecessary index
vacuuming all the time in the extreme cases where it's really hard to
justify. There is currently zero evidence that anything higher than 2%
will ever help anybody to an appreciably degree. (There is also no
evidence that the optimization will ever need to be disabled, but I
accept the need to be conservative and offer an off switch -- the
precautionary principle applies when talking about new harms.)Not having to scan every index on every VACUUM, but only every 5th or
so VACUUM is a huge improvement. But going from every 5th VACUUM to
every 10th VACUUM? That's at best a tiny additional improvement in
exchange for what I'd guess is a roughly linear increase in risk
(maybe a greater-than-linear increase, even). That's an awful deal.
Perhaps that's an awful deal, but based on which facts can you really
say that this new behavior of needing at least 2% of relation pages
with some dead items to clean up indexes is not a worse deal in some
cases? This may cause more problems for the in-core index AMs, as
much as it could impact any out-of-core index AM, no? What about
other values like 1%, or even 5%? My guess is that there would be an
ask to have more control on that, though that stands as my opinion.
Saying that, as long as there is a way to disable that for the users
with autovacuum and manual vacuums, I'd be fine. It is worth noting
that adding an GUC to control this optimization would make the code
more confusing, as there is already do_index_cleanup, a
vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to
TRUE may cause the index cleanup to not actually kick if the 2% bar is
not reached.
--
Michael
On Thu, May 13, 2021 at 7:14 PM Michael Paquier <michael@paquier.xyz> wrote:
Perhaps that's an awful deal, but based on which facts can you really
say that this new behavior of needing at least 2% of relation pages
with some dead items to clean up indexes is not a worse deal in some
cases?
If I thought that it simply wasn't possible then I wouldn't have
accepted the need to make it possible to disable. This is a
cost/benefit decision problem, which must be made based on imperfect
information -- there are no absolute certainties. But I'm certain
about one thing: there is a large practical difference between the
optimization causing terrible performance in certain scenarios and the
optimization causing slightly suboptimal performance in certain
scenarios. A tiny risk of the former scenario is *much* worse than a
relatively large risk of the latter scenario. There needs to be a
sense of proportion about risk.
This may cause more problems for the in-core index AMs, as
much as it could impact any out-of-core index AM, no?
I don't understand what you mean here.
What about
other values like 1%, or even 5%? My guess is that there would be an
ask to have more control on that, though that stands as my opinion.
How did you arrive at that guess? Why do you believe that? This is the
second time I've asked.
Saying that, as long as there is a way to disable that for the users
with autovacuum and manual vacuums, I'd be fine. It is worth noting
that adding an GUC to control this optimization would make the code
more confusing, as there is already do_index_cleanup, a
vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to
TRUE may cause the index cleanup to not actually kick if the 2% bar is
not reached.
I don't intend to add a GUC. A reloption should suffice.
Your interpretation of what specifying vacuum_index_cleanup (the
VACUUM command option) represents doesn't seem particularly justified
to me. To me it just means "index cleanup and vacuuming are not
explicitly disabled, the default behavior". It's an option largely
intended for emergencies, and largely superseded by the failsafe
mechanism. This interpretation is justified by well established
precedent: it has long been possible for VACUUM to skip heap page
pruning and even heap page vacuuming just because a super-exclusive
lock could not be acquired (though the latter case no longer happens
due to the same work inside vacuumlazy.c) -- which also implies
skipping some index vacuuming, without it ever being apparent to the
user.
--
Peter Geoghegan
(I had missed this discussion due to the mismatched thread subject..)
On Fri, May 14, 2021 at 11:14 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
Almost all of the benefit of the optimization is available with the
current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
LP_DEAD items), which has less risk than a higher threshold. I don't
think it matters much if we have the occasional unnecessary round of
index vacuuming on account of not applying the optimization. The truly
important benefit of the optimization is to not do unnecessary index
vacuuming all the time in the extreme cases where it's really hard to
justify. There is currently zero evidence that anything higher than 2%
will ever help anybody to an appreciably degree. (There is also no
evidence that the optimization will ever need to be disabled, but I
accept the need to be conservative and offer an off switch -- the
precautionary principle applies when talking about new harms.)Not having to scan every index on every VACUUM, but only every 5th or
so VACUUM is a huge improvement. But going from every 5th VACUUM to
every 10th VACUUM? That's at best a tiny additional improvement in
exchange for what I'd guess is a roughly linear increase in risk
(maybe a greater-than-linear increase, even). That's an awful deal.Perhaps that's an awful deal, but based on which facts can you really
say that this new behavior of needing at least 2% of relation pages
with some dead items to clean up indexes is not a worse deal in some
cases? This may cause more problems for the in-core index AMs, as
much as it could impact any out-of-core index AM, no? What about
other values like 1%, or even 5%? My guess is that there would be an
ask to have more control on that, though that stands as my opinion.
I'm concerned how users can tune that scale type parameter that can be
configurable between 0.0 and 0.05. I think that users basically don't
pay attention to how many blocks are updated by UPDATE/DELETE. Unlike
old vacuum_cleanup_index_scale_factor, increasing this parameter would
directly affect index bloats. If the user can accept more index bloat
to speed up (auto)vacuum, they can use vacuum_index_cleanup instead.
I prefer to have an on/off switch just in case. I remember I also
commented the same thing before. We’ve discussed a way to control
whether or not to enable the skipping optimization by adding a new
mode to INDEX_CLEANUP option, as Peter mentioned. For example, we can
use the new mode “auto” (or “smart”) mode by default, enabling all
skipping optimizations, and specifying “on” disables them. Or we can
add “force” mode to disable all skipping optimizations while leaving
the existing modes as they are. Anyway, I think it’s not a good idea
to add a new GUC parameter that we’re not sure how to tune.
IIUC skipping index vacuum when less than 2% of relation pages with at
least one LP_DEAD is a table’s optimization rather than a btree
index’s optimization. Since we’re not likely to set many pages
all-visible or collect many dead tuples in that case, we can skip
index vacuuming and table vacuuming. IIUC this case, fortunately, goes
well together btree indexes’ bottom-up deletion. If this skipping
behavior badly affects other indexes AMs, this optimization should be
considered within btree indexes, although we will need a way for index
AMs to consider and tell the vacuum strategy. But I guess this works
well in most cases so having an on/off switch suffice.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I prefer to have an on/off switch just in case. I remember I also
commented the same thing before. We’ve discussed a way to control
whether or not to enable the skipping optimization by adding a new
mode to INDEX_CLEANUP option, as Peter mentioned. For example, we can
use the new mode “auto” (or “smart”) mode by default, enabling all
skipping optimizations, and specifying “on” disables them. Or we can
add “force” mode to disable all skipping optimizations while leaving
the existing modes as they are. Anyway, I think it’s not a good idea
to add a new GUC parameter that we’re not sure how to tune.IIUC skipping index vacuum when less than 2% of relation pages with at
least one LP_DEAD is a table’s optimization rather than a btree
index’s optimization.
Right. There *is* an excellent way to tune this behavior: by adjusting
heap fillfactor to make HOT more effective. That was why I started
this thread!
If you leave heap fillfactor at the default of 100, and have lots of
updates (that don't modify indexed columns) and no deletes, then
you're almost certainly not going to have VACUUM skip indexes anyway
-- in practice you're bound to exceed having 2% of pages with an
LP_DEAD item before very long. Tuning heap fillfactor is practically
essential to see a real benefit, regardless of the exact
BYPASS_THRESHOLD_PAGES. (There may be some rare exceptions, but for
the most part this mechanism helps with tables that get many updates
that are expected to use HOT, and will use HOT barring a tiny number
of cases where the new tuple won't' quite fit, etc.)
The idea of tuning the behavior directly (e.g. with a reloption that
lets the user specify a BYPASS_THRESHOLD_PAGES style threshold) is
exactly backwards. The point for the user should not be to skip
indexes during VACUUM. The point for the user is to get lots of
non-HOT updates to *avoid heap fragmentation*, guided by the new
autovacuum instrumentation. That also means that there will be much
less index vacuuming. But that's a pretty minor side-benefit. Why
should the user *expect* largely unnecessary index vacuuming to take
place?
To put it another way, the index bypass mechanism added to
vacuumlazy.c was not intended to add a new good behavior. It was
actually intended to subtract an old bad behavior. The patch is mostly
useful because it allows the user to make VACUUM *more* aggressive
with freezing and VM bit setting (not less aggressive with indexes).
The BYPASS_THRESHOLD_PAGES threshold of 0.02 is a little arbitrary --
but only a little.
Since we’re not likely to set many pages
all-visible or collect many dead tuples in that case, we can skip
index vacuuming and table vacuuming. IIUC this case, fortunately, goes
well together btree indexes’ bottom-up deletion.
It's true that bottom-up index deletion provides additional insurance
against problems, but I don't think that that insurance is strictly
necessary. It's nice to have insurance, though.
If this skipping
behavior badly affects other indexes AMs, this optimization should be
considered within btree indexes, although we will need a way for index
AMs to consider and tell the vacuum strategy. But I guess this works
well in most cases so having an on/off switch suffice.
Right. I doubt that it will actually turn out to be necessary to have
such a switch. But I try to be modest when it comes to predicting what
will be important to some user workload -- it's way too complicated to
have total confidence about something like that. It is a risk to be
managed.
--
Peter Geoghegan
On Wed, May 19, 2021 at 6:09 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If this skipping
behavior badly affects other indexes AMs, this optimization should be
considered within btree indexes, although we will need a way for index
AMs to consider and tell the vacuum strategy. But I guess this works
well in most cases so having an on/off switch suffice.Right. I doubt that it will actually turn out to be necessary to have
such a switch. But I try to be modest when it comes to predicting what
will be important to some user workload -- it's way too complicated to
have total confidence about something like that. It is a risk to be
managed.
I think the possible side effect of this hard-coded
BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not
called for a long term and the index becomes bloat. IOW, we will
enforce users have the index bloat corresponding to 2% of table pages.
The bloat could be serious depending on the index tuple size (e.g.,
index including many columns). The user may have been running
autovacuums aggressively on that table to prevent the index bloat but
it's no longer possible and there is no choice. So I think that for
those (relatively) rare use cases, it's good to provide a way to
somehow control it. Fortunately, an on/off switch is likely to be
useful for controlling other optimizations that could be added in the
future.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I think the possible side effect of this hard-coded
BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not
called for a long term and the index becomes bloat.
What do you think of the approach taken in the attached POC patch?
The patch makes it possible to disable the optimization by
generalizing the INDEX_CLEANUP reloption to be an enum that looks like
a trinay boolean (not just a plain boolean). INDEX_CLEANUP now accepts
the values 'auto', 'on', and 'off' (plus a variety of alternative
spellings, the usual ones for booleans in Postgres). Now 'auto' is the
default, and 'on' forces the previous behavior inside vacuumlazy.c. It
does not disable the failsafe, though -- INDEX_CLEANUP remains a
fairly mechanical thing.
This approach seems good to me because INDEX_CLEANUP remains
consistent with the original purpose and design of INDEX_CLEANUP --
that was always an option that forced VACUUM to do something special
with indexes. I don't see much downside to this approach, either. As
things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so
we don't really need to talk about wraparound emergencies in the docs
for INDEX_CLEANUP anymore. This seems much more elegant than either
repurposing/reviving cleanup_index_scale_factor (which makes no sense
to me at all) or inventing a new reloption (which would itself be in
tension with INDEX_CLEANUP).
There are some practical issues that make this patch surprisingly
complicated for such a simple problem. For example, I hope that I
haven't missed any subtlety in generalizing a boolean reloption like
this. We've done similar things with GUCs in the past, but this may be
a little different. Another concern with this approach is what it
means for the VACUUM command itself. I haven't added an 'auto'
spelling that is accepted by the VACUUM command in this POC version.
But do I need to at all? Can that just be implied by not having any
INDEX_CLEANUP option? And does StdRdOptions.vacuum_truncate now need
to become a VacOptTernaryValue field too, for consistency with the new
definition of StdRdOptions.vacuum_index_cleanup?
--
Peter Geoghegan
Attachments:
0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchapplication/octet-stream; name=0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchDownload
From 54823d82fcbb2c8bf3eb2a122c461b41b66eb5b6 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 26 May 2021 17:17:05 -0700
Subject: [PATCH] Generalize VACUUM's INDEX_CLEANUP option.
Generalize the reloption INDEX_CLEANUP to enable users to disable the
index vacuum bypassing optimization added by commit 5100010e, as well as
any similar optimizations that may be added in the future.
The reloption is now a trinary boolean style enum whose default is
'auto', which preserves the current VACUUM behavior. Explicitly setting
INDEX_CLEANUP to 'on' now avoids any optimizations that skip index
vacuuming, though it doesn't have any effect on the failsafe mechanism.
The INDEX_CLEANUP reloption was originally added by commit a96c41fe as
an emergency option to allow DBAs to opt out of index vacuuming in
extreme cases -- cases where the likely alternative is an anti-wraparoud
failure. The reloption can still be used for that, but it seems like a
much less useful option now that VACUUM has the failsafe mechanism added
by commit 1e55e7d1.
---
src/include/commands/vacuum.h | 14 +----------
src/include/utils/rel.h | 15 +++++++++++-
src/backend/access/common/reloptions.c | 33 +++++++++++++++++++-------
src/backend/access/heap/vacuumlazy.c | 33 ++++++++++++++------------
src/backend/commands/vacuum.c | 11 +++------
5 files changed, 60 insertions(+), 46 deletions(-)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cb27257bb6..d45891ca31 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -21,6 +21,7 @@
#include "parser/parse_node.h"
#include "storage/buf.h"
#include "storage/lock.h"
+#include "utils/rel.h"
#include "utils/relcache.h"
/*
@@ -184,19 +185,6 @@ typedef struct VacAttrStats
#define VACOPT_PROCESS_TOAST 0x40 /* process the TOAST table, if any */
#define VACOPT_DISABLE_PAGE_SKIPPING 0x80 /* don't skip any pages */
-/*
- * A ternary value used by vacuum parameters.
- *
- * DEFAULT value is used to determine the value based on other
- * configurations, e.g. reloptions.
- */
-typedef enum VacOptTernaryValue
-{
- VACOPT_TERNARY_DEFAULT = 0,
- VACOPT_TERNARY_DISABLED,
- VACOPT_TERNARY_ENABLED,
-} VacOptTernaryValue;
-
/*
* Parameters customizing behavior of VACUUM and ANALYZE.
*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 774ac5b2b1..740e22942e 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -307,6 +307,19 @@ typedef struct AutoVacOpts
float8 analyze_scale_factor;
} AutoVacOpts;
+/*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum VacOptTernaryValue
+{
+ VACOPT_TERNARY_DEFAULT = 0,
+ VACOPT_TERNARY_DISABLED,
+ VACOPT_TERNARY_ENABLED,
+} VacOptTernaryValue;
+
typedef struct StdRdOptions
{
int32 vl_len_; /* varlena header (do not touch directly!) */
@@ -316,7 +329,7 @@ typedef struct StdRdOptions
AutoVacOpts autovacuum; /* autovacuum-related options */
bool user_catalog_table; /* use as an additional catalog relation */
int parallel_workers; /* max number of parallel workers */
- bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
+ VacOptTernaryValue vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
} StdRdOptions;
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 5554275e64..60df707f36 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -140,15 +140,6 @@ static relopt_bool boolRelOpts[] =
},
false
},
- {
- {
- "vacuum_index_cleanup",
- "Enables index vacuuming and index cleanup",
- RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
- ShareUpdateExclusiveLock
- },
- true
- },
{
{
"vacuum_truncate",
@@ -474,6 +465,19 @@ static relopt_real realRelOpts[] =
{{NULL}}
};
+/* values from HeapOptIndexCleanupMode */
+relopt_enum_elt_def HeapOptIndexCleanupOptValues[] =
+{
+ {"auto", VACOPT_TERNARY_DEFAULT},
+ {"on", VACOPT_TERNARY_ENABLED},
+ {"off", VACOPT_TERNARY_DISABLED},
+ {"true", VACOPT_TERNARY_ENABLED},
+ {"false", VACOPT_TERNARY_DISABLED},
+ {"1", VACOPT_TERNARY_ENABLED},
+ {"0", VACOPT_TERNARY_DISABLED},
+ {(const char *) NULL} /* list terminator */
+};
+
/* values from GistOptBufferingMode */
relopt_enum_elt_def gistBufferingOptValues[] =
{
@@ -494,6 +498,17 @@ relopt_enum_elt_def viewCheckOptValues[] =
static relopt_enum enumRelOpts[] =
{
+ {
+ {
+ "vacuum_index_cleanup",
+ "Enables index vacuuming and index cleanup",
+ RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+ ShareUpdateExclusiveLock
+ },
+ HeapOptIndexCleanupOptValues,
+ VACOPT_TERNARY_DEFAULT,
+ gettext_noop("Valid values are \"on\", \"off\", and \"auto\".")
+ },
{
{
"buffering",
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ad3feb88b3..db22ca1ee1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -313,6 +313,8 @@ typedef struct LVRelState
bool do_index_cleanup;
/* Wraparound failsafe in effect? (implies !do_index_vacuuming) */
bool do_failsafe;
+ /* Consider bypass optimization? */
+ bool do_bypass_optimization;
/* Buffer access strategy and parallel state */
BufferAccessStrategy bstrategy;
@@ -406,7 +408,7 @@ static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
GlobalVisState *vistest,
LVPagePruneState *prunestate);
-static void lazy_vacuum(LVRelState *vacrel, bool onecall);
+static void lazy_vacuum(LVRelState *vacrel);
static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
static void lazy_vacuum_heap_rel(LVRelState *vacrel);
static int lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
@@ -507,7 +509,6 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
MultiXactId MultiXactCutoff;
Assert(params != NULL);
- Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
Assert(params->truncate != VACOPT_TERNARY_DEFAULT);
/* measure elapsed time iff autovacuum logging requires it */
@@ -560,11 +561,15 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
vacrel->do_index_vacuuming = true;
vacrel->do_index_cleanup = true;
vacrel->do_failsafe = false;
+ vacrel->do_bypass_optimization = true;
if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
{
vacrel->do_index_vacuuming = false;
vacrel->do_index_cleanup = false;
}
+ else if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+ vacrel->do_bypass_optimization = false;
+
vacrel->bstrategy = bstrategy;
vacrel->old_rel_pages = rel->rd_rel->relpages;
vacrel->old_live_tuples = rel->rd_rel->reltuples;
@@ -893,8 +898,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
next_fsm_block_to_vacuum;
PGRUsage ru0;
Buffer vmbuffer = InvalidBuffer;
- bool skipping_blocks,
- have_vacuumed_indexes = false;
+ bool skipping_blocks;
StringInfoData buf;
const int initprog_index[] = {
PROGRESS_VACUUM_PHASE,
@@ -1167,8 +1171,8 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
}
/* Remove the collected garbage tuples from table and indexes */
- lazy_vacuum(vacrel, false);
- have_vacuumed_indexes = true;
+ vacrel->do_bypass_optimization = false;
+ lazy_vacuum(vacrel);
/*
* Vacuum the Free Space Map to make newly-freed space visible on
@@ -1580,7 +1584,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
/* If any tuples need to be deleted, perform final vacuum cycle */
if (dead_tuples->num_tuples > 0)
- lazy_vacuum(vacrel, !have_vacuumed_indexes);
+ lazy_vacuum(vacrel);
/*
* Vacuum the remainder of the Free Space Map. We must do this whether or
@@ -2065,9 +2069,9 @@ retry:
* wraparound.
*/
static void
-lazy_vacuum(LVRelState *vacrel, bool onecall)
+lazy_vacuum(LVRelState *vacrel)
{
- bool do_bypass_optimization;
+ bool bypass;
/* Should not end up here with no indexes */
Assert(vacrel->nindexes > 0);
@@ -2100,8 +2104,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* It's far easier to ensure that 99%+ of all UPDATEs against a table use
* HOT through careful tuning.
*/
- do_bypass_optimization = false;
- if (onecall && vacrel->rel_pages > 0)
+ bypass = false;
+ if (vacrel->do_bypass_optimization && vacrel->rel_pages > 0)
{
BlockNumber threshold;
@@ -2133,12 +2137,11 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* expanded to cover more cases then this may need to be reconsidered.
*/
threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
- do_bypass_optimization =
- (vacrel->lpdead_item_pages < threshold &&
- vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
+ bypass = (vacrel->lpdead_item_pages < threshold &&
+ vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
}
- if (do_bypass_optimization)
+ if (bypass)
{
/*
* There are almost zero TIDs. Behave as if there were precisely
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7421d7cfbd..3b17d028af 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1939,14 +1939,9 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
LockRelationIdForSession(&lockrelid, lmode);
/* Set index cleanup option based on reloptions if not yet */
- if (params->index_cleanup == VACOPT_TERNARY_DEFAULT)
- {
- if (rel->rd_options == NULL ||
- ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup)
- params->index_cleanup = VACOPT_TERNARY_ENABLED;
- else
- params->index_cleanup = VACOPT_TERNARY_DISABLED;
- }
+ if (params->index_cleanup == VACOPT_TERNARY_DEFAULT && rel->rd_options != NULL)
+ params->index_cleanup =
+ ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup;
/* Set truncate option based on reloptions if not yet */
if (params->truncate == VACOPT_TERNARY_DEFAULT)
--
2.27.0
On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I think the possible side effect of this hard-coded
BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not
called for a long term and the index becomes bloat.What do you think of the approach taken in the attached POC patch?
The patch makes it possible to disable the optimization by
generalizing the INDEX_CLEANUP reloption to be an enum that looks like
a trinay boolean (not just a plain boolean). INDEX_CLEANUP now accepts
the values 'auto', 'on', and 'off' (plus a variety of alternative
spellings, the usual ones for booleans in Postgres). Now 'auto' is the
default, and 'on' forces the previous behavior inside vacuumlazy.c. It
does not disable the failsafe, though -- INDEX_CLEANUP remains a
fairly mechanical thing.
+1
This approach seems good to me because INDEX_CLEANUP remains
consistent with the original purpose and design of INDEX_CLEANUP --
that was always an option that forced VACUUM to do something special
with indexes. I don't see much downside to this approach, either. As
things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so
we don't really need to talk about wraparound emergencies in the docs
for INDEX_CLEANUP anymore. This seems much more elegant than either
repurposing/reviving cleanup_index_scale_factor (which makes no sense
to me at all) or inventing a new reloption (which would itself be in
tension with INDEX_CLEANUP).
+1
There are some practical issues that make this patch surprisingly
complicated for such a simple problem. For example, I hope that I
haven't missed any subtlety in generalizing a boolean reloption like
this. We've done similar things with GUCs in the past, but this may be
a little different.
+/* values from HeapOptIndexCleanupMode */
+relopt_enum_elt_def HeapOptIndexCleanupOptValues[] =
+{
+ {"auto", VACOPT_TERNARY_DEFAULT},
+ {"on", VACOPT_TERNARY_ENABLED},
+ {"off", VACOPT_TERNARY_DISABLED},
+ {"true", VACOPT_TERNARY_ENABLED},
+ {"false", VACOPT_TERNARY_DISABLED},
+ {"1", VACOPT_TERNARY_ENABLED},
+ {"0", VACOPT_TERNARY_DISABLED},
+ {(const char *) NULL} /* list terminator */
+};
We need to accept "yes" and "no" too? Currently, the parsing of a
boolean type reloption accepts those words.
Another concern with this approach is what it
means for the VACUUM command itself. I haven't added an 'auto'
spelling that is accepted by the VACUUM command in this POC version.
But do I need to at all? Can that just be implied by not having any
INDEX_CLEANUP option?
It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?
And does StdRdOptions.vacuum_truncate now need
to become a VacOptTernaryValue field too, for consistency with the new
definition of StdRdOptions.vacuum_index_cleanup?
We don't have the bypass optimization for heap truncation, unlike
index vacuuming. So I think we can leave both vacuum_truncate
reloption and TRUNCATE option as boolean parameters.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
Another concern with this approach is what it
means for the VACUUM command itself. I haven't added an 'auto'
spelling that is accepted by the VACUUM command in this POC version.
But do I need to at all? Can that just be implied by not having any
INDEX_CLEANUP option?It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?
I have read through the patch, and I am surprised to see that this
only makes possible to control the optimization at relation level.
The origin of the complaints is that this index cleanup optimization
has been introduced as a new rule that gets enforced at *system*
level, so I think that we should have an equivalent with a GUC to
control the behavior for the whole system. With what you are
presenting here, one could only disable the optimization for each
relation, one-by-one. If this optimization proves to be a problem,
it's just going to be harder to users to go through all the relations
and re-tune autovacuum. Am I missing something?
--
Michael
On Fri, Jun 4, 2021 at 3:15 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
Another concern with this approach is what it
means for the VACUUM command itself. I haven't added an 'auto'
spelling that is accepted by the VACUUM command in this POC version.
But do I need to at all? Can that just be implied by not having any
INDEX_CLEANUP option?It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?I have read through the patch, and I am surprised to see that this
only makes possible to control the optimization at relation level.
The origin of the complaints is that this index cleanup optimization
has been introduced as a new rule that gets enforced at *system*
level, so I think that we should have an equivalent with a GUC to
control the behavior for the whole system. With what you are
presenting here, one could only disable the optimization for each
relation, one-by-one. If this optimization proves to be a problem,
it's just going to be harder to users to go through all the relations
and re-tune autovacuum. Am I missing something?
I've not thought to disable that optimization at system level. I think
we can use this option for particular tables whose indexes
unexpectedly bloated much due to this optimization. Similarly, we have
DISABLE_PAGE_SKIPPING option but don’t have a way to disable lazy
vacuum’s page skipping behavior at system level.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote:
I have read through the patch, and I am surprised to see that this
only makes possible to control the optimization at relation level.
The origin of the complaints is that this index cleanup optimization
has been introduced as a new rule that gets enforced at *system*
level, so I think that we should have an equivalent with a GUC to
control the behavior for the whole system.
*Why* does it have to work at the system level? I don't understand
what you mean about the system level.
As Masahiko pointed out, adding a GUC isn't what we've done in other
similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a
defensive option that seems similar enough to what we want to add now.
To give another example, the TRUNCATE VACUUM option (or the related
reloption) can be used to disable relation truncation, a behavior that
sometimes causes *big* issues in production. The truncate behavior is
determined dynamically in most situations -- which is another
similarity to the optimization we've added here.
Why is this fundamentally different to those two things?
With what you are
presenting here, one could only disable the optimization for each
relation, one-by-one. If this optimization proves to be a problem,
it's just going to be harder to users to go through all the relations
and re-tune autovacuum. Am I missing something?
Why would you expect autovacuum to run even when the optimization is
unavailable (e.g. with Postgres 13)? After all, the specifics of when
the bypass optimization kicks in make it very unlikely that ANALYZE
will ever be able to notice enough dead tuples to trigger an
autovacuum (barring antiwraparound and insert-driven autovacuums).
There will probably be very few LP_DEAD items remaining. Occasionally
there will be somewhat more LP_DEAD items, that happen to be
concentrated in less than 2% of the table's blocks -- but block-based
sampling by ANALYZE is likely to miss most of them and underestimate
the total number. The sampling approach taken by acquire_sample_rows()
ensures this with larger tables. With small tables the chances of the
optimization kicking in are very low, unless perhaps fillfactor has
been tuned very aggressively.
There has never been a guarantee that autovacuum will be triggered
(and do index vacuuming) in cases that have very few LP_DEAD items, no
matter how the system has been tuned. The main reason why making the
optimization behavior controllable is for the VACUUM command.
Principally for hackers. I can imagine myself using the VACUUM option
to disable the optimization when I was interested in testing VACUUM or
space utilization in some specific, narrow way.
Of course it's true that there is still some uncertainty about the
optimization harming production workloads -- that is to be expected
with an enhancement like this one. But there is still no actual
example or test case that shows the optimization doing the wrong
thing, or anything like it. Anything is possible, but I am not
expecting there to be even one user complaint about the feature.
Naturally I don't want to add something as heavyweight as a GUC, given
all that.
--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Another concern with this approach is what it
means for the VACUUM command itself. I haven't added an 'auto'
spelling that is accepted by the VACUUM command in this POC version.
But do I need to at all? Can that just be implied by not having any
INDEX_CLEANUP option?It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?
I suppose we should have it. But now we have to teach vacuumdb about
this new boolean-like enum too. It's a lot more new code than I would
have preferred, but I suppose that it makes sense.
--
Peter Geoghegan
On Fri, Jun 11, 2021 at 02:46:20PM -0700, Peter Geoghegan wrote:
On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote:
I have read through the patch, and I am surprised to see that this
only makes possible to control the optimization at relation level.
The origin of the complaints is that this index cleanup optimization
has been introduced as a new rule that gets enforced at *system*
level, so I think that we should have an equivalent with a GUC to
control the behavior for the whole system.*Why* does it have to work at the system level? I don't understand
what you mean about the system level.
I mean that you lack a GUC that allows to enforce to *not* use this
optimization for all relations, for all processes.
As Masahiko pointed out, adding a GUC isn't what we've done in other
similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a
defensive option that seems similar enough to what we want to add now.
To give another example, the TRUNCATE VACUUM option (or the related
reloption) can be used to disable relation truncation, a behavior that
sometimes causes *big* issues in production. The truncate behavior is
determined dynamically in most situations -- which is another
similarity to the optimization we've added here.
Why is this fundamentally different to those two things?
Because the situation looks completely different to me here. TRUNCATE
is thought as a option to be able to avoid an exclusive lock when
truncating the relation file size at the end of VACUUM. More
importantly the default of TRUNCATE is *false*, meaning that we are
never going to skip the truncation unless one specifies it at the
relation level.
Here, what we have is a decision that is enforced to happen by
default, all the time, with the user not knowing about it. If there
is a bug of an issue with it, users, based on your proposal, would be
forced to change it for each *relation*. If they miss some of those
relations, they may still run into problems without knowing about it.
The change of default behavior and having no way to control it in
a simple way look incompatible to me.
With what you are
presenting here, one could only disable the optimization for each
relation, one-by-one. If this optimization proves to be a problem,
it's just going to be harder to users to go through all the relations
and re-tune autovacuum. Am I missing something?Why would you expect autovacuum to run even when the optimization is
unavailable (e.g. with Postgres 13)? After all, the specifics of when
the bypass optimization kicks in make it very unlikely that ANALYZE
will ever be able to notice enough dead tuples to trigger an
autovacuum (barring antiwraparound and insert-driven autovacuums).There will probably be very few LP_DEAD items remaining. Occasionally
there will be somewhat more LP_DEAD items, that happen to be
concentrated in less than 2% of the table's blocks -- but block-based
sampling by ANALYZE is likely to miss most of them and underestimate
the total number. The sampling approach taken by acquire_sample_rows()
ensures this with larger tables. With small tables the chances of the
optimization kicking in are very low, unless perhaps fillfactor has
been tuned very aggressively.There has never been a guarantee that autovacuum will be triggered
(and do index vacuuming) in cases that have very few LP_DEAD items, no
matter how the system has been tuned. The main reason why making the
optimization behavior controllable is for the VACUUM command.
Principally for hackers. I can imagine myself using the VACUUM option
to disable the optimization when I was interested in testing VACUUM or
space utilization in some specific, narrow way.Of course it's true that there is still some uncertainty about the
optimization harming production workloads -- that is to be expected
with an enhancement like this one. But there is still no actual
example or test case that shows the optimization doing the wrong
thing, or anything like it. Anything is possible, but I am not
expecting there to be even one user complaint about the feature.
Naturally I don't want to add something as heavyweight as a GUC, given
all that.
Perhaps. What I am really scared about is that you are assuming that
enforcing this decision will be *always* fine. What I am trying to
say here is that it *may not* be fine for everybody, and that there
should be an easy way to turn it off if that proves to be a problem.
I don't quite see how that's an implementation problem, we have
already many reloptions that are controlled with GUCs if the
reloptions have no default.
I think that a more careful choice implementation would have been to
turn this optimization off by default, while having an option to allow
one to turn it on at will.
--
Michael
On Mon, Jun 14, 2021 at 5:23 PM Michael Paquier <michael@paquier.xyz> wrote:
*Why* does it have to work at the system level? I don't understand
what you mean about the system level.I mean that you lack a GUC that allows to enforce to *not* use this
optimization for all relations, for all processes.
You've just explained what a GUC is. This is not helpful.
Why is this fundamentally different to those two things?
Because the situation looks completely different to me here. TRUNCATE
is thought as a option to be able to avoid an exclusive lock when
truncating the relation file size at the end of VACUUM. More
importantly the default of TRUNCATE is *false*, meaning that we are
never going to skip the truncation unless one specifies it at the
relation level.
Maybe it looks different to you because that's not actually true;
VACUUM *does* skip the truncation when it feels like it, regardless of
the value of the reloption. In general there is no strict guarantee of
truncation ever happening -- see lazy_truncate_heap().
Again: Why is this fundamentally different?
Here, what we have is a decision that is enforced to happen by
default, all the time, with the user not knowing about it. If there
is a bug of an issue with it, users, based on your proposal, would be
forced to change it for each *relation*. If they miss some of those
relations, they may still run into problems without knowing about it.
The change of default behavior and having no way to control it in
a simple way look incompatible to me.
You've just explained what a reloption is. Again, this is not helping.
Perhaps. What I am really scared about is that you are assuming that
enforcing this decision will be *always* fine.
I very clearly and repeatedly said that there was uncertainty about
causing issues in rare real world cases. Are you always 100% certain
that your code has no bugs before you commit it?
Should I now add a GUC for every single feature that I commit? You are
just asserting that we must need to add a GUC, without giving any real
reasons -- you're just giving generic reasons that work just as well
in most situations. I'm baffled by this.
What I am trying to
say here is that it *may not* be fine for everybody, and that there
should be an easy way to turn it off if that proves to be a problem.
As I said, I think that the relationship is both necessary and
sufficient. A GUC is a heavyweight solution that seems quite
unnecessary.
I don't quite see how that's an implementation problem, we have
already many reloptions that are controlled with GUCs if the
reloptions have no default.
I never said that there was an implementation problem with a GUC. Just
that it was unnecessary, and not consistent with existing practice.
Does anyone else have an opinion on this? Of course I can easily add a
GUC. But I won't do so in the absence of any real argument in favor of
it.
I think that a more careful choice implementation would have been to
turn this optimization off by default, while having an option to allow
one to turn it on at will.
You have yet to say anything about the implementation.
--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
We need to accept "yes" and "no" too? Currently, the parsing of a
boolean type reloption accepts those words.
Added those in the attached revision, version 2. This is much closer
to being commitable than v1 was. I plan on committing this in the next
several days.
I probably need to polish the documentation some more, though.
It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?
v2 sorts out the mess with VacOptTernaryValue by just adding a new
enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the
enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a
new reloption-specific enum, StdRdOptIndexCleanup, which is the
datatype that we actually use inside the StdRdOptions struct. So we
are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the
patch.
v2 also adds a new option to vacuumdb, --force-index-cleanup. This
seemed to make sense because we already have a --no-index-cleanup
option.
And does StdRdOptions.vacuum_truncate now need
to become a VacOptTernaryValue field too, for consistency with the new
definition of StdRdOptions.vacuum_index_cleanup?We don't have the bypass optimization for heap truncation, unlike
index vacuuming. So I think we can leave both vacuum_truncate
reloption and TRUNCATE option as boolean parameters.
Actually FWIW we do have a bypass optimization for TRUNCATE -- it too
has an always-on dynamic behavior -- so it really is like the index
vacuuming thing. In theory it might make sense to have the same "auto,
on, off" thing, just like with index vacuuming in the patch. However,
I haven't done that in the patch because in practice it's a bad idea.
If we offered users the option of truly forcing truncation, then
lazy_truncate_heap() could just insist on truncating. It would have to
just wait for an AEL, no matter how long it took. That would probably
be dangerous because waiting for an AEL without backing out in VACUUM
just isn't a great idea.
Thanks
--
Peter Geoghegan
Attachments:
v2-0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchapplication/octet-stream; name=v2-0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchDownload
From b7813aee4f9770fe23cf64eaffc8c119a517f3f3 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 26 May 2021 17:17:05 -0700
Subject: [PATCH v2] Generalize VACUUM's INDEX_CLEANUP option.
Generalize the reloption INDEX_CLEANUP to enable users to disable the
index vacuum bypassing optimization added by commit 5100010e, as well as
any similar optimizations that may be added in the future.
The reloption is now a trinary boolean style enum whose default is
'auto', which preserves the current VACUUM behavior. Explicitly setting
INDEX_CLEANUP to 'on' now avoids any optimizations that skip index
vacuuming, though it doesn't have any effect on the failsafe mechanism.
The INDEX_CLEANUP reloption was originally added by commit a96c41fe as
an emergency option to allow DBAs to opt out of index vacuuming in
extreme cases -- cases where the likely alternative is an anti-wraparoud
failure. The reloption can still be used for that, but it seems like a
much less useful option now that VACUUM has the failsafe mechanism added
by commit 1e55e7d1.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=w7r1dpFEr=ija2XFn798wd41+6QUSbkcc8HmWKhEhdA@mail.gmail.com
---
src/include/commands/vacuum.h | 4 +-
src/include/utils/rel.h | 15 ++++++-
src/backend/access/common/reloptions.c | 35 +++++++++++----
src/backend/access/heap/vacuumlazy.c | 48 ++++++++++++++------
src/backend/commands/vacuum.c | 47 +++++++++++++++++---
src/bin/scripts/vacuumdb.c | 61 +++++++++++++++++++++-----
doc/src/sgml/ref/create_table.sgml | 21 +++++----
doc/src/sgml/ref/vacuum.sgml | 43 +++++++++++-------
doc/src/sgml/ref/vacuumdb.sgml | 15 +++++++
src/test/regress/expected/vacuum.out | 8 ++--
src/test/regress/sql/vacuum.sql | 8 ++--
11 files changed, 236 insertions(+), 69 deletions(-)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cb27257bb6..a357ce148d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,11 +188,13 @@ typedef struct VacAttrStats
* A ternary value used by vacuum parameters.
*
* DEFAULT value is used to determine the value based on other
- * configurations, e.g. reloptions.
+ * configurations, e.g. reloptions. AUTO value is used by
+ * index_cleanup option only.
*/
typedef enum VacOptTernaryValue
{
VACOPT_TERNARY_DEFAULT = 0,
+ VACOPT_TERNARY_AUTO,
VACOPT_TERNARY_DISABLED,
VACOPT_TERNARY_ENABLED,
} VacOptTernaryValue;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 774ac5b2b1..6924ed730d 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -307,6 +307,19 @@ typedef struct AutoVacOpts
float8 analyze_scale_factor;
} AutoVacOpts;
+/*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum StdRdOptIndexCleanup
+{
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO = 0,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON
+} StdRdOptIndexCleanup;
+
typedef struct StdRdOptions
{
int32 vl_len_; /* varlena header (do not touch directly!) */
@@ -316,7 +329,7 @@ typedef struct StdRdOptions
AutoVacOpts autovacuum; /* autovacuum-related options */
bool user_catalog_table; /* use as an additional catalog relation */
int parallel_workers; /* max number of parallel workers */
- bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
+ StdRdOptIndexCleanup vacuum_index_cleanup; /* controls index vacuuming */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
} StdRdOptions;
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 5554275e64..7ef934cbc4 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -140,15 +140,6 @@ static relopt_bool boolRelOpts[] =
},
false
},
- {
- {
- "vacuum_index_cleanup",
- "Enables index vacuuming and index cleanup",
- RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
- ShareUpdateExclusiveLock
- },
- true
- },
{
{
"vacuum_truncate",
@@ -474,6 +465,21 @@ static relopt_real realRelOpts[] =
{{NULL}}
};
+/* values from StdRdOptIndexCleanup */
+relopt_enum_elt_def StdRdOptIndexCleanupValues[] =
+{
+ {"auto", STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO},
+ {"on", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"off", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"true", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"false", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"yes", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"no", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"1", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"0", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {(const char *) NULL} /* list terminator */
+};
+
/* values from GistOptBufferingMode */
relopt_enum_elt_def gistBufferingOptValues[] =
{
@@ -494,6 +500,17 @@ relopt_enum_elt_def viewCheckOptValues[] =
static relopt_enum enumRelOpts[] =
{
+ {
+ {
+ "vacuum_index_cleanup",
+ "Forces or disables index vacuuming and index cleanup",
+ RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+ ShareUpdateExclusiveLock
+ },
+ StdRdOptIndexCleanupValues,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO,
+ gettext_noop("Valid values are \"on\", \"off\", and \"auto\".")
+ },
{
{
"buffering",
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88db2e2cfc..8c7f72c775 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -313,6 +313,8 @@ typedef struct LVRelState
bool do_index_cleanup;
/* Wraparound failsafe in effect? (implies !do_index_vacuuming) */
bool do_failsafe;
+ /* Consider bypass optimization? */
+ bool do_bypass_optimization;
/* Buffer access strategy and parallel state */
BufferAccessStrategy bstrategy;
@@ -405,7 +407,7 @@ static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
GlobalVisState *vistest,
LVPagePruneState *prunestate);
-static void lazy_vacuum(LVRelState *vacrel, bool onecall);
+static void lazy_vacuum(LVRelState *vacrel);
static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
static void lazy_vacuum_heap_rel(LVRelState *vacrel);
static int lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
@@ -509,6 +511,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
Assert(params != NULL);
Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
Assert(params->truncate != VACOPT_TERNARY_DEFAULT);
+ Assert(params->truncate != VACOPT_TERNARY_AUTO);
/* measure elapsed time iff autovacuum logging requires it */
if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -557,14 +560,33 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
vacrel->rel = rel;
vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes,
&vacrel->indrels);
+
+ /*
+ * index_cleanup either disables index vacuuming and cleanup or forces it
+ * to go ahead when we would otherwise apply the bypass optimization. The
+ * default is 'auto', which leaves the final decision up to lazy_vacuum().
+ */
vacrel->do_index_vacuuming = true;
vacrel->do_index_cleanup = true;
vacrel->do_failsafe = false;
+ vacrel->do_bypass_optimization = true;
if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
{
+ /* Force disable index vacuuming up-front */
vacrel->do_index_vacuuming = false;
vacrel->do_index_cleanup = false;
}
+ else if (params->index_cleanup == VACOPT_TERNARY_ENABLED)
+ {
+ /* Force index vacuuming (never bypass) */
+ vacrel->do_bypass_optimization = false;
+ }
+ else
+ {
+ /* Default/auto, make all decisions dynamically */
+ Assert(params->index_cleanup == VACOPT_TERNARY_AUTO);
+ }
+
vacrel->bstrategy = bstrategy;
vacrel->old_rel_pages = rel->rd_rel->relpages;
vacrel->old_live_tuples = rel->rd_rel->reltuples;
@@ -893,8 +915,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
next_fsm_block_to_vacuum;
PGRUsage ru0;
Buffer vmbuffer = InvalidBuffer;
- bool skipping_blocks,
- have_vacuumed_indexes = false;
+ bool skipping_blocks;
StringInfoData buf;
const int initprog_index[] = {
PROGRESS_VACUUM_PHASE,
@@ -1166,8 +1187,8 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
}
/* Remove the collected garbage tuples from table and indexes */
- lazy_vacuum(vacrel, false);
- have_vacuumed_indexes = true;
+ vacrel->do_bypass_optimization = false;
+ lazy_vacuum(vacrel);
/*
* Vacuum the Free Space Map to make newly-freed space visible on
@@ -1579,7 +1600,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
/* If any tuples need to be deleted, perform final vacuum cycle */
if (dead_tuples->num_tuples > 0)
- lazy_vacuum(vacrel, !have_vacuumed_indexes);
+ lazy_vacuum(vacrel);
/*
* Vacuum the remainder of the Free Space Map. We must do this whether or
@@ -2064,9 +2085,9 @@ retry:
* wraparound.
*/
static void
-lazy_vacuum(LVRelState *vacrel, bool onecall)
+lazy_vacuum(LVRelState *vacrel)
{
- bool do_bypass_optimization;
+ bool bypass;
/* Should not end up here with no indexes */
Assert(vacrel->nindexes > 0);
@@ -2099,8 +2120,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* It's far easier to ensure that 99%+ of all UPDATEs against a table use
* HOT through careful tuning.
*/
- do_bypass_optimization = false;
- if (onecall && vacrel->rel_pages > 0)
+ bypass = false;
+ if (vacrel->do_bypass_optimization && vacrel->rel_pages > 0)
{
BlockNumber threshold;
@@ -2132,12 +2153,11 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* expanded to cover more cases then this may need to be reconsidered.
*/
threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
- do_bypass_optimization =
- (vacrel->lpdead_item_pages < threshold &&
- vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
+ bypass = (vacrel->lpdead_item_pages < threshold &&
+ vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
}
- if (do_bypass_optimization)
+ if (bypass)
{
/*
* There are almost zero TIDs. Behave as if there were precisely
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7421d7cfbd..9190e9cb70 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -142,7 +142,21 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
else if (strcmp(opt->defname, "disable_page_skipping") == 0)
disable_page_skipping = defGetBoolean(opt);
else if (strcmp(opt->defname, "index_cleanup") == 0)
- params.index_cleanup = get_vacopt_ternary_value(opt);
+ {
+ /* Interpret no string as the default, which is 'auto' */
+ if (!opt->arg)
+ params.index_cleanup = VACOPT_TERNARY_AUTO;
+ else
+ {
+ char *sval = defGetString(opt);
+
+ /* Try matching on 'auto' string, or fall back on boolean */
+ if (pg_strcasecmp(sval, "auto") == 0)
+ params.index_cleanup = VACOPT_TERNARY_AUTO;
+ else
+ params.index_cleanup = get_vacopt_ternary_value(opt);
+ }
+ }
else if (strcmp(opt->defname, "process_toast") == 0)
process_toast = defGetBoolean(opt);
else if (strcmp(opt->defname, "truncate") == 0)
@@ -1938,17 +1952,37 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
lockrelid = rel->rd_lockInfo.lockRelId;
LockRelationIdForSession(&lockrelid, lmode);
- /* Set index cleanup option based on reloptions if not yet */
+ /*
+ * Set index cleanup option based on vacuum_index_cleanup reloptions if
+ * not yet. This is a ternary style boolean option.
+ */
if (params->index_cleanup == VACOPT_TERNARY_DEFAULT)
{
- if (rel->rd_options == NULL ||
- ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup)
+ StdRdOptIndexCleanup vacuum_index_cleanup;
+
+ /* The reloption default is 'auto', just like the VACUUM option */
+ if (rel->rd_options == NULL)
+ vacuum_index_cleanup = STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO;
+ else
+ vacuum_index_cleanup =
+ ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup;
+
+ if (vacuum_index_cleanup == STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO)
+ params->index_cleanup = VACOPT_TERNARY_AUTO;
+ else if (vacuum_index_cleanup == STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON)
params->index_cleanup = VACOPT_TERNARY_ENABLED;
else
+ {
+ Assert(vacuum_index_cleanup ==
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF);
params->index_cleanup = VACOPT_TERNARY_DISABLED;
+ }
}
- /* Set truncate option based on reloptions if not yet */
+ /*
+ * Set index cleanup option based on truncate reloptions if not yet. This
+ * is a simple boolean reloption.
+ */
if (params->truncate == VACOPT_TERNARY_DEFAULT)
{
if (rel->rd_options == NULL ||
@@ -2219,6 +2253,9 @@ compute_parallel_delay(void)
*
* This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
* instead of true and false.
+ *
+ * There is also a VACOPT_TERNARY_AUTO value that is used only by the
+ * INDEX_CLEANUP option. We don't handle that here.
*/
static VacOptTernaryValue
get_vacopt_ternary_value(DefElem *def)
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 069a861aab..7df75602eb 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -39,7 +39,8 @@ typedef struct vacuumingOptions
int min_mxid_age;
int parallel_workers; /* >= 0 indicates user specified the
* parallel degree, otherwise -1 */
- bool do_index_cleanup;
+ bool no_index_cleanup;
+ bool force_index_cleanup;
bool do_truncate;
bool process_toast;
} vacuumingOptions;
@@ -99,8 +100,9 @@ main(int argc, char *argv[])
{"min-xid-age", required_argument, NULL, 6},
{"min-mxid-age", required_argument, NULL, 7},
{"no-index-cleanup", no_argument, NULL, 8},
- {"no-truncate", no_argument, NULL, 9},
- {"no-process-toast", no_argument, NULL, 10},
+ {"force-index-cleanup", no_argument, NULL, 9},
+ {"no-truncate", no_argument, NULL, 10},
+ {"no-process-toast", no_argument, NULL, 11},
{NULL, 0, NULL, 0}
};
@@ -126,7 +128,8 @@ main(int argc, char *argv[])
/* initialize options */
memset(&vacopts, 0, sizeof(vacopts));
vacopts.parallel_workers = -1;
- vacopts.do_index_cleanup = true;
+ vacopts.no_index_cleanup = false;
+ vacopts.force_index_cleanup = false;
vacopts.do_truncate = true;
vacopts.process_toast = true;
@@ -233,12 +236,15 @@ main(int argc, char *argv[])
}
break;
case 8:
- vacopts.do_index_cleanup = false;
+ vacopts.no_index_cleanup = true;
break;
case 9:
- vacopts.do_truncate = false;
+ vacopts.force_index_cleanup = true;
break;
case 10:
+ vacopts.do_truncate = false;
+ break;
+ case 11:
vacopts.process_toast = false;
break;
default:
@@ -285,12 +291,18 @@ main(int argc, char *argv[])
"disable-page-skipping");
exit(1);
}
- if (!vacopts.do_index_cleanup)
+ if (vacopts.no_index_cleanup)
{
pg_log_error("cannot use the \"%s\" option when performing only analyze",
"no-index-cleanup");
exit(1);
}
+ if (vacopts.force_index_cleanup)
+ {
+ pg_log_error("cannot use the \"%s\" option when performing only analyze",
+ "force-index-cleanup");
+ exit(1);
+ }
if (!vacopts.do_truncate)
{
pg_log_error("cannot use the \"%s\" option when performing only analyze",
@@ -323,6 +335,14 @@ main(int argc, char *argv[])
}
}
+ /* Prohibit --no-index-cleanup and --force-index-cleanup together */
+ if (vacopts.no_index_cleanup && vacopts.force_index_cleanup)
+ {
+ pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+ "no-index-cleanup", "force-index-cleanup");
+ exit(1);
+ }
+
/* fill cparams except for dbname, which is set below */
cparams.pghost = host;
cparams.pgport = port;
@@ -453,7 +473,7 @@ vacuum_one_database(ConnParams *cparams,
exit(1);
}
- if (!vacopts->do_index_cleanup && PQserverVersion(conn) < 120000)
+ if (vacopts->no_index_cleanup && PQserverVersion(conn) < 120000)
{
PQfinish(conn);
pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
@@ -461,6 +481,14 @@ vacuum_one_database(ConnParams *cparams,
exit(1);
}
+ if (vacopts->force_index_cleanup && PQserverVersion(conn) < 120000)
+ {
+ PQfinish(conn);
+ pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+ "force-index-cleanup", "12");
+ exit(1);
+ }
+
if (!vacopts->do_truncate && PQserverVersion(conn) < 120000)
{
PQfinish(conn);
@@ -878,10 +906,23 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
sep = comma;
}
- if (!vacopts->do_index_cleanup)
+ if (vacopts->no_index_cleanup)
{
- /* INDEX_CLEANUP is supported since v12 */
+ /* "INDEX_CLEANUP FALSE" has been supported since v12 */
Assert(serverVersion >= 120000);
+ Assert(!vacopts->force_index_cleanup);
+ appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
+ sep = comma;
+ }
+ if (vacopts->force_index_cleanup)
+ {
+ /*
+ * "INDEX_CLEANUP TRUE" has been supported since v12. Though
+ * the --force-index-cleanup vacuumdb option was only added in
+ * v14, it still works in the same way on v12+.
+ */
+ Assert(serverVersion >= 120000);
+ Assert(!vacopts->no_index_cleanup);
appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
sep = comma;
}
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index c6d0a35e50..a5193eeaac 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1473,20 +1473,25 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</varlistentry>
<varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
- <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
+ <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>enum</type>)
<indexterm>
<primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
</indexterm>
</term>
<listitem>
<para>
- Enables or disables index cleanup when <command>VACUUM</command> is
- run on this table. The default value is <literal>true</literal>.
- Disabling index cleanup can speed up <command>VACUUM</command> very
- significantly, but may also lead to severely bloated indexes if table
- modifications are frequent. The <literal>INDEX_CLEANUP</literal>
- parameter of <link linkend="sql-vacuum"><command>VACUUM</command></link>, if specified, overrides
- the value of this option.
+ Disables or forces index cleanup when <command>VACUUM</command>
+ is run on this table. The default value is
+ <literal>AUTO</literal>. With <literal>OFF</literal> index
+ cleanup is disabled, with <literal>ON</literal> it is enabled,
+ and with <literal>AUTO</literal> the decision is made
+ dynamically by a <command>VACUUM</command> cost model.
+ Disabling index cleanup can speed up <command>VACUUM</command>
+ very significantly, but may also lead to severely bloated
+ indexes if table modifications are frequent.
+ The <literal>INDEX_CLEANUP</literal> parameter of <link
+ linkend="sql-vacuum"><command>VACUUM</command></link>, if
+ specified, overrides the value of this option.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 5f67c9d18b..3a32144ad0 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,7 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
- INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+ INDEX_CLEANUP [ <replaceable class="parameter">enum</replaceable> ]
PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
PARALLEL <replaceable class="parameter">integer</replaceable>
@@ -193,20 +193,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
<term><literal>INDEX_CLEANUP</literal></term>
<listitem>
<para>
- Specifies that <command>VACUUM</command> should attempt to remove
- index entries pointing to dead tuples. This is normally the desired
- behavior and is the default unless the
- <literal>vacuum_index_cleanup</literal> option has been set to false
- for the table to be vacuumed. Setting this option to false may be
- useful when it is necessary to make vacuum run as quickly as possible,
- for example to avoid imminent transaction ID wraparound
- (see <xref linkend="vacuum-for-wraparound"/>). However, if index
- cleanup is not performed regularly, performance may suffer, because
- as the table is modified, indexes will accumulate dead tuples
- and the table itself will accumulate dead line pointers that cannot be
- removed until index cleanup is completed. This option has no effect
- for tables that do not have an index and is ignored if the
- <literal>FULL</literal> option is used.
+ Specifies when and how <command>VACUUM</command> should remove
+ index entries pointing to dead tuples. The default setting is
+ <literal>AUTO</literal>, which makes vacuum remove index entries
+ unless the table has an insignificant number of dead tuples.
+ Other valid settings are <literal>OFF</literal>, which forces
+ index cleanup to be bypassed in all cases, and
+ <literal>ON</literal>, which prevents vacuum from considering
+ bypassing index cleanup in cases where there is more than zero
+ dead tuples.
+ </para>
+ <para>
+ If <literal>INDEX_CLEANUP</literal> is set to
+ <literal>OFF</literal> performance may suffer, because as the
+ table is modified, indexes will accumulate dead tuples and the
+ table itself will accumulate dead line pointers that cannot be
+ removed until index cleanup is completed. This option has no
+ effect for tables that do not have an index and is ignored if
+ the <literal>FULL</literal> option is used.
</para>
</listitem>
</varlistentry>
@@ -284,6 +288,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><replaceable class="parameter">enum</replaceable></term>
+ <listitem>
+ <para>
+ Specifies an enumeration value passed to the selected option.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">integer</replaceable></term>
<listitem>
diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 3e7f7ed68f..537724184e 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -244,6 +244,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--no-index-cleanup</option></term>
+ <listitem>
+ <para>
+ Always remove index entries pointing to dead tuples.
+ </para>
+ <note>
+ <para>
+ This option is only available for servers running
+ <productname>PostgreSQL</productname> 12 and later.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>--no-process-toast</option></term>
<listitem>
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 5e657849aa..e5771462d5 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -146,13 +146,15 @@ VACUUM no_index_cleanup;
-- Both parent relation and toast are cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
VACUUM no_index_cleanup;
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = auto);
+VACUUM no_index_cleanup;
-- Parameter is set for both the parent table and its toast relation.
INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
repeat('1234567890',269));
DELETE FROM no_index_cleanup WHERE i < 45;
-- Only toast index is cleaned up.
-ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
- toast.vacuum_index_cleanup = true);
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = off,
+ toast.vacuum_index_cleanup = yes);
VACUUM no_index_cleanup;
-- Only parent is cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
@@ -160,7 +162,7 @@ ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
VACUUM no_index_cleanup;
-- Test some extra relations.
VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP AUTO) vactst; -- index cleanup option is ignored if no indexes
VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
-- TRUNCATE option
CREATE TABLE vac_truncate_test(i INT NOT NULL, j text)
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 93fd258fc0..f220fc28a7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -127,13 +127,15 @@ VACUUM no_index_cleanup;
-- Both parent relation and toast are cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
VACUUM no_index_cleanup;
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = auto);
+VACUUM no_index_cleanup;
-- Parameter is set for both the parent table and its toast relation.
INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
repeat('1234567890',269));
DELETE FROM no_index_cleanup WHERE i < 45;
-- Only toast index is cleaned up.
-ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
- toast.vacuum_index_cleanup = true);
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = off,
+ toast.vacuum_index_cleanup = yes);
VACUUM no_index_cleanup;
-- Only parent is cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
@@ -141,7 +143,7 @@ ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
VACUUM no_index_cleanup;
-- Test some extra relations.
VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP AUTO) vactst; -- index cleanup option is ignored if no indexes
VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
-- TRUNCATE option
--
2.27.0
On Thu, Jun 17, 2021 at 10:54 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
We need to accept "yes" and "no" too? Currently, the parsing of a
boolean type reloption accepts those words.Added those in the attached revision, version 2. This is much closer
to being commitable than v1 was. I plan on committing this in the next
several days.I probably need to polish the documentation some more, though.
It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
command support AUTO for consistency. Do you have any concerns about
supporting it?v2 sorts out the mess with VacOptTernaryValue by just adding a new
enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the
enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a
new reloption-specific enum, StdRdOptIndexCleanup, which is the
datatype that we actually use inside the StdRdOptions struct. So we
are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the
patch.v2 also adds a new option to vacuumdb, --force-index-cleanup. This
seemed to make sense because we already have a --no-index-cleanup
option.And does StdRdOptions.vacuum_truncate now need
to become a VacOptTernaryValue field too, for consistency with the new
definition of StdRdOptions.vacuum_index_cleanup?We don't have the bypass optimization for heap truncation, unlike
index vacuuming. So I think we can leave both vacuum_truncate
reloption and TRUNCATE option as boolean parameters.Actually FWIW we do have a bypass optimization for TRUNCATE -- it too
has an always-on dynamic behavior -- so it really is like the index
vacuuming thing. In theory it might make sense to have the same "auto,
on, off" thing, just like with index vacuuming in the patch. However,
I haven't done that in the patch because in practice it's a bad idea.
If we offered users the option of truly forcing truncation, then
lazy_truncate_heap() could just insist on truncating. It would have to
just wait for an AEL, no matter how long it took. That would probably
be dangerous because waiting for an AEL without backing out in VACUUM
just isn't a great idea.
I agree that it doesn't make sense to force heap truncation.
Thank you for updating the patch! Here are comments on v2 patch:
typedef enum VacOptTernaryValue
{
VACOPT_TERNARY_DEFAULT = 0,
+ VACOPT_TERNARY_AUTO,
VACOPT_TERNARY_DISABLED,
VACOPT_TERNARY_ENABLED,
} VacOptTernaryValue;
VacOptTernaryValue is no longer a ternary value. Can we rename it
something like VacOptValue?
---
+ if (vacopts->force_index_cleanup)
{
- /* INDEX_CLEANUP is supported since v12 */
+ /*
+ * "INDEX_CLEANUP TRUE" has been supported since v12. Though
+ * the --force-index-cleanup vacuumdb option was only added in
+ * v14, it still works in the same way on v12+.
+ */
Assert(serverVersion >= 120000);
+ Assert(!vacopts->no_index_cleanup);
appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
sep = comma;
}
We should specify TRUE instead.
---
--force-index-cleanup option isn't shown in the help message.
---
I think we also improve the tab completion for INDEX_CLEANUP option.
---
@@ -32,7 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [
<replaceable class="paramet
ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
DISABLE_PAGE_SKIPPING [ <replaceable
class="parameter">boolean</replaceable> ]
SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
- INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+ INDEX_CLEANUP [ <replaceable class="parameter">enum</replaceable> ]
PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
PARALLEL <replaceable class="parameter">integer</replaceable>
How about listing the available values of INDEX_CLEANUP here instead
of enum? For example, we do a similar thing in the description of
FORMAT option of EXPLAIN command. It would be easier to perceive all
available values.
---
+ <varlistentry>
+ <term><option>--no-index-cleanup</option></term>
+ <listitem>
+ <para>
It should be --force-index-cleanup.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
+ <literal>AUTO</literal>. With <literal>OFF</literal> index
+ cleanup is disabled, with <literal>ON</literal> it is enabled,
OFF comma
+ bypassing index cleanup in cases where there is more than zero
+ dead tuples.
*are* more than zero
Or (preferably): "cases when there are no dead tuples at all"
+ If <literal>INDEX_CLEANUP</literal> is set to
+ <literal>OFF</literal> performance may suffer, because as the
OFF comma
+ removed until index cleanup is completed. This option has no
+ effect for tables that do not have an index and is ignored if
+ the <literal>FULL</literal> option is used.
I'd say "tables that have no indexes,"
--
Justin
On Thu, Jun 17, 2021 at 2:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! Here are comments on v2 patch:
Thanks for the review!
Attached is v3, which has all the changes that you suggested (plus the
doc stuff from Justin).
I also renamed the "default" VacOptTernaryValue (actually now called
VacOptValue) value -- it seems clearer to call this "unspecified".
Because it represents a state that has nothing to do with the default
of the reloption or GUC. Really, it means "VACUUM command didn't have
this specified explicitly" (note that this means that it always starts
out "default" in an autovacuum worker). Unspecified seems much clearer
because it directly expresses "fall back on the reloption, and then
fall back on the reloption's default". I find this much clearer -- it
is unspecified, but will have to *become* specified later, so that
vacuumlazy.c has a truly usable value ("unspecified" is never usable
in vacuumlazy.c).
VacOptTernaryValue is no longer a ternary value. Can we rename it
something like VacOptValue?
As I said, done that way.
We should specify TRUE instead.
Ooops. Fixed.
--force-index-cleanup option isn't shown in the help message.
Fixed.
---
I think we also improve the tab completion for INDEX_CLEANUP option.
Fixed.
How about listing the available values of INDEX_CLEANUP here instead
of enum? For example, we do a similar thing in the description of
FORMAT option of EXPLAIN command. It would be easier to perceive all
available values.
That looks much better. Fixed.
It should be --force-index-cleanup.
Fixed.
--
Peter Geoghegan
Attachments:
v3-0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchapplication/octet-stream; name=v3-0001-Generalize-VACUUM-s-INDEX_CLEANUP-option.patchDownload
From 51143ed7d65748490e10a5d81476388e53f6af90 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 26 May 2021 17:17:05 -0700
Subject: [PATCH v3] Generalize VACUUM's INDEX_CLEANUP option.
Generalize the reloption INDEX_CLEANUP to enable users to disable the
index vacuum bypassing optimization added by commit 5100010e, as well as
any similar optimizations that may be added in the future.
The reloption is now a trinary boolean style enum whose default is
'auto', which preserves the current VACUUM behavior. Explicitly setting
INDEX_CLEANUP to 'on' now avoids any optimizations that skip index
vacuuming, though it doesn't have any effect on the failsafe mechanism.
The INDEX_CLEANUP reloption was originally added by commit a96c41fe as
an emergency option to allow DBAs to opt out of index vacuuming in
extreme cases -- cases where the likely alternative is an anti-wraparoud
failure. The reloption can still be used for that, but it seems like a
much less useful option now that VACUUM has the failsafe mechanism added
by commit 1e55e7d1.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CAH2-WznrBoCST4_Gxh_G9hA8NzGUbeBGnOUC8FcXcrhqsv6OHQ@mail.gmail.com
---
src/include/commands/vacuum.h | 31 ++++++-----
src/include/utils/rel.h | 15 +++++-
src/backend/access/common/reloptions.c | 35 ++++++++----
src/backend/access/heap/vacuumlazy.c | 56 ++++++++++++-------
src/backend/commands/vacuum.c | 75 +++++++++++++++++++-------
src/backend/postmaster/autovacuum.c | 9 +++-
src/bin/psql/tab-complete.c | 4 +-
src/bin/scripts/vacuumdb.c | 62 +++++++++++++++++----
doc/src/sgml/ref/create_table.sgml | 21 +++++---
doc/src/sgml/ref/vacuum.sgml | 29 +++++-----
doc/src/sgml/ref/vacuumdb.sgml | 15 ++++++
src/test/regress/expected/vacuum.out | 8 +--
src/test/regress/sql/vacuum.sql | 8 +--
13 files changed, 268 insertions(+), 100 deletions(-)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cb27257bb6..57f93db4a9 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -185,17 +185,24 @@ typedef struct VacAttrStats
#define VACOPT_DISABLE_PAGE_SKIPPING 0x80 /* don't skip any pages */
/*
- * A ternary value used by vacuum parameters.
+ * Values used by VACUUM's index_cleanup and truncate parameters.
*
- * DEFAULT value is used to determine the value based on other
- * configurations, e.g. reloptions.
+ * *_UNSPECIFIED is used as an initial placeholder. When VACUUM command does
+ * not have explicit specification (or when running in an autovacuum worker),
+ * final value used by vacuumlazy.c is taken from the reloption.
+ *
+ * *_AUTO is used by index_cleanup option/reloption only, where it is the
+ * default.
+ *
+ * *_ENABLED is the default for the truncate option/reloption.
*/
-typedef enum VacOptTernaryValue
+typedef enum VacOptValue
{
- VACOPT_TERNARY_DEFAULT = 0,
- VACOPT_TERNARY_DISABLED,
- VACOPT_TERNARY_ENABLED,
-} VacOptTernaryValue;
+ VACOPT_VAL_UNSPECIFIED = 0,
+ VACOPT_VAL_AUTO,
+ VACOPT_VAL_DISABLED,
+ VACOPT_VAL_ENABLED,
+} VacOptValue;
/*
* Parameters customizing behavior of VACUUM and ANALYZE.
@@ -216,10 +223,10 @@ typedef struct VacuumParams
int log_min_duration; /* minimum execution threshold in ms at
* which verbose logs are activated, -1
* to use default */
- VacOptTernaryValue index_cleanup; /* Do index vacuum and cleanup,
- * default value depends on reloptions */
- VacOptTernaryValue truncate; /* Truncate empty pages at the end,
- * default value depends on reloptions */
+ VacOptValue index_cleanup; /* Do index vacuum and cleanup, default value
+ * depends on reloptions */
+ VacOptValue truncate; /* Truncate empty pages at the end, default value
+ * depends on reloptions */
/*
* The number of parallel vacuum workers. 0 by default which means choose
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 774ac5b2b1..6924ed730d 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -307,6 +307,19 @@ typedef struct AutoVacOpts
float8 analyze_scale_factor;
} AutoVacOpts;
+/*
+ * A ternary value used by vacuum parameters.
+ *
+ * DEFAULT value is used to determine the value based on other
+ * configurations, e.g. reloptions.
+ */
+typedef enum StdRdOptIndexCleanup
+{
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO = 0,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON
+} StdRdOptIndexCleanup;
+
typedef struct StdRdOptions
{
int32 vl_len_; /* varlena header (do not touch directly!) */
@@ -316,7 +329,7 @@ typedef struct StdRdOptions
AutoVacOpts autovacuum; /* autovacuum-related options */
bool user_catalog_table; /* use as an additional catalog relation */
int parallel_workers; /* max number of parallel workers */
- bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
+ StdRdOptIndexCleanup vacuum_index_cleanup; /* controls index vacuuming */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
} StdRdOptions;
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 5554275e64..7ef934cbc4 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -140,15 +140,6 @@ static relopt_bool boolRelOpts[] =
},
false
},
- {
- {
- "vacuum_index_cleanup",
- "Enables index vacuuming and index cleanup",
- RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
- ShareUpdateExclusiveLock
- },
- true
- },
{
{
"vacuum_truncate",
@@ -474,6 +465,21 @@ static relopt_real realRelOpts[] =
{{NULL}}
};
+/* values from StdRdOptIndexCleanup */
+relopt_enum_elt_def StdRdOptIndexCleanupValues[] =
+{
+ {"auto", STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO},
+ {"on", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"off", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"true", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"false", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"yes", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"no", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {"1", STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON},
+ {"0", STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF},
+ {(const char *) NULL} /* list terminator */
+};
+
/* values from GistOptBufferingMode */
relopt_enum_elt_def gistBufferingOptValues[] =
{
@@ -494,6 +500,17 @@ relopt_enum_elt_def viewCheckOptValues[] =
static relopt_enum enumRelOpts[] =
{
+ {
+ {
+ "vacuum_index_cleanup",
+ "Forces or disables index vacuuming and index cleanup",
+ RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+ ShareUpdateExclusiveLock
+ },
+ StdRdOptIndexCleanupValues,
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO,
+ gettext_noop("Valid values are \"on\", \"off\", and \"auto\".")
+ },
{
{
"buffering",
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88db2e2cfc..44f1bf0d86 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -313,6 +313,8 @@ typedef struct LVRelState
bool do_index_cleanup;
/* Wraparound failsafe in effect? (implies !do_index_vacuuming) */
bool do_failsafe;
+ /* Consider bypass optimization? */
+ bool do_bypass_optimization;
/* Buffer access strategy and parallel state */
BufferAccessStrategy bstrategy;
@@ -405,7 +407,7 @@ static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
GlobalVisState *vistest,
LVPagePruneState *prunestate);
-static void lazy_vacuum(LVRelState *vacrel, bool onecall);
+static void lazy_vacuum(LVRelState *vacrel);
static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
static void lazy_vacuum_heap_rel(LVRelState *vacrel);
static int lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
@@ -507,8 +509,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
MultiXactId MultiXactCutoff;
Assert(params != NULL);
- Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
- Assert(params->truncate != VACOPT_TERNARY_DEFAULT);
+ Assert(params->index_cleanup != VACOPT_VAL_UNSPECIFIED);
+ Assert(params->truncate != VACOPT_VAL_UNSPECIFIED);
+ Assert(params->truncate != VACOPT_VAL_AUTO);
/* measure elapsed time iff autovacuum logging requires it */
if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -557,14 +560,33 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
vacrel->rel = rel;
vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes,
&vacrel->indrels);
+
+ /*
+ * index_cleanup either disables index vacuuming and cleanup or forces it
+ * to go ahead when we would otherwise apply the bypass optimization. The
+ * default is 'auto', which leaves the final decision up to lazy_vacuum().
+ */
vacrel->do_index_vacuuming = true;
vacrel->do_index_cleanup = true;
vacrel->do_failsafe = false;
- if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
+ vacrel->do_bypass_optimization = true;
+ if (params->index_cleanup == VACOPT_VAL_DISABLED)
{
+ /* Force disable index vacuuming up-front */
vacrel->do_index_vacuuming = false;
vacrel->do_index_cleanup = false;
}
+ else if (params->index_cleanup == VACOPT_VAL_ENABLED)
+ {
+ /* Force index vacuuming (never bypass) */
+ vacrel->do_bypass_optimization = false;
+ }
+ else
+ {
+ /* Default/auto, make all decisions dynamically */
+ Assert(params->index_cleanup == VACOPT_VAL_AUTO);
+ }
+
vacrel->bstrategy = bstrategy;
vacrel->old_rel_pages = rel->rd_rel->relpages;
vacrel->old_live_tuples = rel->rd_rel->reltuples;
@@ -893,8 +915,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
next_fsm_block_to_vacuum;
PGRUsage ru0;
Buffer vmbuffer = InvalidBuffer;
- bool skipping_blocks,
- have_vacuumed_indexes = false;
+ bool skipping_blocks;
StringInfoData buf;
const int initprog_index[] = {
PROGRESS_VACUUM_PHASE,
@@ -1166,8 +1187,8 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
}
/* Remove the collected garbage tuples from table and indexes */
- lazy_vacuum(vacrel, false);
- have_vacuumed_indexes = true;
+ vacrel->do_bypass_optimization = false;
+ lazy_vacuum(vacrel);
/*
* Vacuum the Free Space Map to make newly-freed space visible on
@@ -1579,7 +1600,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
/* If any tuples need to be deleted, perform final vacuum cycle */
if (dead_tuples->num_tuples > 0)
- lazy_vacuum(vacrel, !have_vacuumed_indexes);
+ lazy_vacuum(vacrel);
/*
* Vacuum the remainder of the Free Space Map. We must do this whether or
@@ -2064,9 +2085,9 @@ retry:
* wraparound.
*/
static void
-lazy_vacuum(LVRelState *vacrel, bool onecall)
+lazy_vacuum(LVRelState *vacrel)
{
- bool do_bypass_optimization;
+ bool bypass;
/* Should not end up here with no indexes */
Assert(vacrel->nindexes > 0);
@@ -2099,8 +2120,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* It's far easier to ensure that 99%+ of all UPDATEs against a table use
* HOT through careful tuning.
*/
- do_bypass_optimization = false;
- if (onecall && vacrel->rel_pages > 0)
+ bypass = false;
+ if (vacrel->do_bypass_optimization && vacrel->rel_pages > 0)
{
BlockNumber threshold;
@@ -2132,12 +2153,11 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
* expanded to cover more cases then this may need to be reconsidered.
*/
threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
- do_bypass_optimization =
- (vacrel->lpdead_item_pages < threshold &&
- vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
+ bypass = (vacrel->lpdead_item_pages < threshold &&
+ vacrel->lpdead_items < MAXDEADTUPLES(32L * 1024L * 1024L));
}
- if (do_bypass_optimization)
+ if (bypass)
{
/*
* There are almost zero TIDs. Behave as if there were precisely
@@ -3140,7 +3160,7 @@ should_attempt_truncation(LVRelState *vacrel, VacuumParams *params)
{
BlockNumber possibly_freeable;
- if (params->truncate == VACOPT_TERNARY_DISABLED)
+ if (params->truncate == VACOPT_VAL_DISABLED)
return false;
if (vacrel->do_failsafe)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7421d7cfbd..fcf7da641f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -88,7 +88,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
MultiXactId lastSaneMinMulti);
static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
static double compute_parallel_delay(void);
-static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
+static VacOptValue get_vacopt_ternary_value(DefElem *def);
/*
* Primary entry point for manual VACUUM and ANALYZE commands
@@ -109,9 +109,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
bool process_toast = true;
ListCell *lc;
- /* Set default value */
- params.index_cleanup = VACOPT_TERNARY_DEFAULT;
- params.truncate = VACOPT_TERNARY_DEFAULT;
+ /* index_cleanup and truncate are unspecified at first */
+ params.index_cleanup = VACOPT_VAL_UNSPECIFIED;
+ params.truncate = VACOPT_VAL_UNSPECIFIED;
/* By default parallel vacuum is enabled */
params.nworkers = 0;
@@ -142,7 +142,21 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
else if (strcmp(opt->defname, "disable_page_skipping") == 0)
disable_page_skipping = defGetBoolean(opt);
else if (strcmp(opt->defname, "index_cleanup") == 0)
- params.index_cleanup = get_vacopt_ternary_value(opt);
+ {
+ /* Interpret no string as the default, which is 'auto' */
+ if (!opt->arg)
+ params.index_cleanup = VACOPT_VAL_AUTO;
+ else
+ {
+ char *sval = defGetString(opt);
+
+ /* Try matching on 'auto' string, or fall back on boolean */
+ if (pg_strcasecmp(sval, "auto") == 0)
+ params.index_cleanup = VACOPT_VAL_AUTO;
+ else
+ params.index_cleanup = get_vacopt_ternary_value(opt);
+ }
+ }
else if (strcmp(opt->defname, "process_toast") == 0)
process_toast = defGetBoolean(opt);
else if (strcmp(opt->defname, "truncate") == 0)
@@ -1938,24 +1952,44 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
lockrelid = rel->rd_lockInfo.lockRelId;
LockRelationIdForSession(&lockrelid, lmode);
- /* Set index cleanup option based on reloptions if not yet */
- if (params->index_cleanup == VACOPT_TERNARY_DEFAULT)
+ /*
+ * Set index_cleanup option based on index_cleanup reloption if it wasn't
+ * specified in VACUUM command, or when running in an autovacuum worker
+ */
+ if (params->index_cleanup == VACOPT_VAL_UNSPECIFIED)
{
- if (rel->rd_options == NULL ||
- ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup)
- params->index_cleanup = VACOPT_TERNARY_ENABLED;
+ StdRdOptIndexCleanup vacuum_index_cleanup;
+
+ /* The reloption default is 'auto', just like the VACUUM option */
+ if (rel->rd_options == NULL)
+ vacuum_index_cleanup = STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO;
else
- params->index_cleanup = VACOPT_TERNARY_DISABLED;
+ vacuum_index_cleanup =
+ ((StdRdOptions *) rel->rd_options)->vacuum_index_cleanup;
+
+ if (vacuum_index_cleanup == STDRD_OPTION_VACUUM_INDEX_CLEANUP_AUTO)
+ params->index_cleanup = VACOPT_VAL_AUTO;
+ else if (vacuum_index_cleanup == STDRD_OPTION_VACUUM_INDEX_CLEANUP_ON)
+ params->index_cleanup = VACOPT_VAL_ENABLED;
+ else
+ {
+ Assert(vacuum_index_cleanup ==
+ STDRD_OPTION_VACUUM_INDEX_CLEANUP_OFF);
+ params->index_cleanup = VACOPT_VAL_DISABLED;
+ }
}
- /* Set truncate option based on reloptions if not yet */
- if (params->truncate == VACOPT_TERNARY_DEFAULT)
+ /*
+ * Set truncate option based on truncate reloption if it wasn't specified
+ * in VACUUM command, or when running in an autovacuum worker
+ */
+ if (params->truncate == VACOPT_VAL_UNSPECIFIED)
{
if (rel->rd_options == NULL ||
((StdRdOptions *) rel->rd_options)->vacuum_truncate)
- params->truncate = VACOPT_TERNARY_ENABLED;
+ params->truncate = VACOPT_VAL_ENABLED;
else
- params->truncate = VACOPT_TERNARY_DISABLED;
+ params->truncate = VACOPT_VAL_DISABLED;
}
/*
@@ -2217,11 +2251,14 @@ compute_parallel_delay(void)
/*
* A wrapper function of defGetBoolean().
*
- * This function returns VACOPT_TERNARY_ENABLED and VACOPT_TERNARY_DISABLED
- * instead of true and false.
+ * This function returns VACOPT_VAL_ENABLED and VACOPT_VAL_DISABLED instead of
+ * true and false.
+ *
+ * There is also a VACOPT_VAC_AUTO value that is used only by the
+ * INDEX_CLEANUP option. We don't handle that here.
*/
-static VacOptTernaryValue
+static VacOptValue
get_vacopt_ternary_value(DefElem *def)
{
- return defGetBoolean(def) ? VACOPT_TERNARY_ENABLED : VACOPT_TERNARY_DISABLED;
+ return defGetBoolean(def) ? VACOPT_VAL_ENABLED : VACOPT_VAL_DISABLED;
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d516df0ac5..95a3cf0d52 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2976,8 +2976,13 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.options = (dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |
(!wraparound ? VACOPT_SKIP_LOCKED : 0);
- tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
- tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+ /*
+ * index_cleanup and truncate are unspecified at first in autovacuum.
+ * They will be filled in with usable values via the reloption (or its
+ * default) later.
+ */
+ tab->at_params.index_cleanup = VACOPT_VAL_UNSPECIFIED;
+ tab->at_params.truncate = VACOPT_VAL_UNSPECIFIED;
/* As of now, we don't support parallel vacuum for autovacuum */
tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index bd8e9ea2f8..38af5682f2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3917,8 +3917,10 @@ psql_completion(const char *text, int start, int end)
"DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
"INDEX_CLEANUP", "PROCESS_TOAST",
"TRUNCATE", "PARALLEL");
- else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|INDEX_CLEANUP|PROCESS_TOAST|TRUNCATE"))
+ else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|PROCESS_TOAST|TRUNCATE"))
COMPLETE_WITH("ON", "OFF");
+ else if (TailMatches("INDEX_CLEANUP"))
+ COMPLETE_WITH("AUTO", "ON", "OFF");
}
else if (HeadMatches("VACUUM") && TailMatches("("))
/* "VACUUM (" should be caught above, so assume we want columns */
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 069a861aab..bf8755285d 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -39,7 +39,8 @@ typedef struct vacuumingOptions
int min_mxid_age;
int parallel_workers; /* >= 0 indicates user specified the
* parallel degree, otherwise -1 */
- bool do_index_cleanup;
+ bool no_index_cleanup;
+ bool force_index_cleanup;
bool do_truncate;
bool process_toast;
} vacuumingOptions;
@@ -99,8 +100,9 @@ main(int argc, char *argv[])
{"min-xid-age", required_argument, NULL, 6},
{"min-mxid-age", required_argument, NULL, 7},
{"no-index-cleanup", no_argument, NULL, 8},
- {"no-truncate", no_argument, NULL, 9},
- {"no-process-toast", no_argument, NULL, 10},
+ {"force-index-cleanup", no_argument, NULL, 9},
+ {"no-truncate", no_argument, NULL, 10},
+ {"no-process-toast", no_argument, NULL, 11},
{NULL, 0, NULL, 0}
};
@@ -126,7 +128,8 @@ main(int argc, char *argv[])
/* initialize options */
memset(&vacopts, 0, sizeof(vacopts));
vacopts.parallel_workers = -1;
- vacopts.do_index_cleanup = true;
+ vacopts.no_index_cleanup = false;
+ vacopts.force_index_cleanup = false;
vacopts.do_truncate = true;
vacopts.process_toast = true;
@@ -233,12 +236,15 @@ main(int argc, char *argv[])
}
break;
case 8:
- vacopts.do_index_cleanup = false;
+ vacopts.no_index_cleanup = true;
break;
case 9:
- vacopts.do_truncate = false;
+ vacopts.force_index_cleanup = true;
break;
case 10:
+ vacopts.do_truncate = false;
+ break;
+ case 11:
vacopts.process_toast = false;
break;
default:
@@ -285,12 +291,18 @@ main(int argc, char *argv[])
"disable-page-skipping");
exit(1);
}
- if (!vacopts.do_index_cleanup)
+ if (vacopts.no_index_cleanup)
{
pg_log_error("cannot use the \"%s\" option when performing only analyze",
"no-index-cleanup");
exit(1);
}
+ if (vacopts.force_index_cleanup)
+ {
+ pg_log_error("cannot use the \"%s\" option when performing only analyze",
+ "force-index-cleanup");
+ exit(1);
+ }
if (!vacopts.do_truncate)
{
pg_log_error("cannot use the \"%s\" option when performing only analyze",
@@ -323,6 +335,14 @@ main(int argc, char *argv[])
}
}
+ /* Prohibit --no-index-cleanup and --force-index-cleanup together */
+ if (vacopts.no_index_cleanup && vacopts.force_index_cleanup)
+ {
+ pg_log_error("cannot use the \"%s\" option with the \"%s\" option",
+ "no-index-cleanup", "force-index-cleanup");
+ exit(1);
+ }
+
/* fill cparams except for dbname, which is set below */
cparams.pghost = host;
cparams.pgport = port;
@@ -453,7 +473,7 @@ vacuum_one_database(ConnParams *cparams,
exit(1);
}
- if (!vacopts->do_index_cleanup && PQserverVersion(conn) < 120000)
+ if (vacopts->no_index_cleanup && PQserverVersion(conn) < 120000)
{
PQfinish(conn);
pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
@@ -461,6 +481,14 @@ vacuum_one_database(ConnParams *cparams,
exit(1);
}
+ if (vacopts->force_index_cleanup && PQserverVersion(conn) < 120000)
+ {
+ PQfinish(conn);
+ pg_log_error("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+ "force-index-cleanup", "12");
+ exit(1);
+ }
+
if (!vacopts->do_truncate && PQserverVersion(conn) < 120000)
{
PQfinish(conn);
@@ -878,13 +906,26 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
appendPQExpBuffer(sql, "%sDISABLE_PAGE_SKIPPING", sep);
sep = comma;
}
- if (!vacopts->do_index_cleanup)
+ if (vacopts->no_index_cleanup)
{
- /* INDEX_CLEANUP is supported since v12 */
+ /* "INDEX_CLEANUP FALSE" has been supported since v12 */
Assert(serverVersion >= 120000);
+ Assert(!vacopts->force_index_cleanup);
appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
sep = comma;
}
+ if (vacopts->force_index_cleanup)
+ {
+ /*
+ * "INDEX_CLEANUP TRUE" has been supported since v12. Though
+ * the --force-index-cleanup vacuumdb option was only added in
+ * v14, it still works in the same way on v12+.
+ */
+ Assert(serverVersion >= 120000);
+ Assert(!vacopts->no_index_cleanup);
+ appendPQExpBuffer(sql, "%sINDEX_CLEANUP TRUE", sep);
+ sep = comma;
+ }
if (!vacopts->do_truncate)
{
/* TRUNCATE is supported since v12 */
@@ -998,6 +1039,7 @@ help(const char *progname)
printf(_(" --min-mxid-age=MXID_AGE minimum multixact ID age of tables to vacuum\n"));
printf(_(" --min-xid-age=XID_AGE minimum transaction ID age of tables to vacuum\n"));
printf(_(" --no-index-cleanup don't remove index entries that point to dead tuples\n"));
+ printf(_(" --force-index-cleanup always remove index entries that point to dead tuples\n"));
printf(_(" --no-process-toast skip the TOAST table associated with the table to vacuum\n"));
printf(_(" --no-truncate don't truncate empty pages at the end of the table\n"));
printf(_(" -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index c6d0a35e50..a5193eeaac 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1473,20 +1473,25 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</varlistentry>
<varlistentry id="reloption-vacuum-index-cleanup" xreflabel="vacuum_index_cleanup">
- <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>boolean</type>)
+ <term><literal>vacuum_index_cleanup</literal>, <literal>toast.vacuum_index_cleanup</literal> (<type>enum</type>)
<indexterm>
<primary><varname>vacuum_index_cleanup</varname> storage parameter</primary>
</indexterm>
</term>
<listitem>
<para>
- Enables or disables index cleanup when <command>VACUUM</command> is
- run on this table. The default value is <literal>true</literal>.
- Disabling index cleanup can speed up <command>VACUUM</command> very
- significantly, but may also lead to severely bloated indexes if table
- modifications are frequent. The <literal>INDEX_CLEANUP</literal>
- parameter of <link linkend="sql-vacuum"><command>VACUUM</command></link>, if specified, overrides
- the value of this option.
+ Disables or forces index cleanup when <command>VACUUM</command>
+ is run on this table. The default value is
+ <literal>AUTO</literal>. With <literal>OFF</literal> index
+ cleanup is disabled, with <literal>ON</literal> it is enabled,
+ and with <literal>AUTO</literal> the decision is made
+ dynamically by a <command>VACUUM</command> cost model.
+ Disabling index cleanup can speed up <command>VACUUM</command>
+ very significantly, but may also lead to severely bloated
+ indexes if table modifications are frequent.
+ The <literal>INDEX_CLEANUP</literal> parameter of <link
+ linkend="sql-vacuum"><command>VACUUM</command></link>, if
+ specified, overrides the value of this option.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 5f67c9d18b..761fdbe4f4 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -32,7 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
- INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+ INDEX_CLEANUP { AUTO | ON | OFF }
PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
PARALLEL <replaceable class="parameter">integer</replaceable>
@@ -193,19 +193,20 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
<term><literal>INDEX_CLEANUP</literal></term>
<listitem>
<para>
- Specifies that <command>VACUUM</command> should attempt to remove
- index entries pointing to dead tuples. This is normally the desired
- behavior and is the default unless the
- <literal>vacuum_index_cleanup</literal> option has been set to false
- for the table to be vacuumed. Setting this option to false may be
- useful when it is necessary to make vacuum run as quickly as possible,
- for example to avoid imminent transaction ID wraparound
- (see <xref linkend="vacuum-for-wraparound"/>). However, if index
- cleanup is not performed regularly, performance may suffer, because
- as the table is modified, indexes will accumulate dead tuples
- and the table itself will accumulate dead line pointers that cannot be
- removed until index cleanup is completed. This option has no effect
- for tables that do not have an index and is ignored if the
+ Specifies when and how <command>VACUUM</command> should remove
+ index entries pointing to dead tuples. The default setting is
+ <literal>AUTO</literal>, which makes vacuum remove index entries
+ unless the table has an insignificant number of dead tuples.
+ Other valid settings are <literal>OFF</literal>, which forces
+ index cleanup to be bypassed in all cases, and
+ <literal>ON</literal>, which prevents vacuum from considering
+ bypassing index cleanup in cases where there are more than zero
+ dead tuples. If <literal>INDEX_CLEANUP</literal> is set to
+ <literal>OFF</literal>, performance may suffer, because as the
+ table is modified indexes will accumulate dead tuples and the
+ table itself will accumulate dead line pointers that cannot be
+ removed until index cleanup is completed. This option has no
+ effect for tables that have no index and is ignored if the
<literal>FULL</literal> option is used.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 3e7f7ed68f..d799804cda 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -244,6 +244,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--force-index-cleanup</option></term>
+ <listitem>
+ <para>
+ Always remove index entries pointing to dead tuples.
+ </para>
+ <note>
+ <para>
+ This option is only available for servers running
+ <productname>PostgreSQL</productname> 12 and later.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>--no-process-toast</option></term>
<listitem>
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 5e657849aa..e5771462d5 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -146,13 +146,15 @@ VACUUM no_index_cleanup;
-- Both parent relation and toast are cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
VACUUM no_index_cleanup;
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = auto);
+VACUUM no_index_cleanup;
-- Parameter is set for both the parent table and its toast relation.
INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
repeat('1234567890',269));
DELETE FROM no_index_cleanup WHERE i < 45;
-- Only toast index is cleaned up.
-ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
- toast.vacuum_index_cleanup = true);
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = off,
+ toast.vacuum_index_cleanup = yes);
VACUUM no_index_cleanup;
-- Only parent is cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
@@ -160,7 +162,7 @@ ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
VACUUM no_index_cleanup;
-- Test some extra relations.
VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP AUTO) vactst; -- index cleanup option is ignored if no indexes
VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
-- TRUNCATE option
CREATE TABLE vac_truncate_test(i INT NOT NULL, j text)
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 93fd258fc0..f220fc28a7 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -127,13 +127,15 @@ VACUUM no_index_cleanup;
-- Both parent relation and toast are cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true);
VACUUM no_index_cleanup;
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = auto);
+VACUUM no_index_cleanup;
-- Parameter is set for both the parent table and its toast relation.
INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(31,60),
repeat('1234567890',269));
DELETE FROM no_index_cleanup WHERE i < 45;
-- Only toast index is cleaned up.
-ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = false,
- toast.vacuum_index_cleanup = true);
+ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = off,
+ toast.vacuum_index_cleanup = yes);
VACUUM no_index_cleanup;
-- Only parent is cleaned up.
ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
@@ -141,7 +143,7 @@ ALTER TABLE no_index_cleanup SET (vacuum_index_cleanup = true,
VACUUM no_index_cleanup;
-- Test some extra relations.
VACUUM (INDEX_CLEANUP FALSE) vaccluster;
-VACUUM (INDEX_CLEANUP FALSE) vactst; -- index cleanup option is ignored if no indexes
+VACUUM (INDEX_CLEANUP AUTO) vactst; -- index cleanup option is ignored if no indexes
VACUUM (INDEX_CLEANUP FALSE, FREEZE TRUE) vaccluster;
-- TRUNCATE option
--
2.27.0
On Thu, Jun 17, 2021 at 5:55 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
(Various sgml typos)
Fixed in the v3 I just posted.
+ removed until index cleanup is completed. This option has no + effect for tables that do not have an index and is ignored if + the <literal>FULL</literal> option is used.I'd say "tables that have no indexes,"
That wording wasn't mine (it just happened to be moved around by
reformatting), but I think you're right. I went with your suggestion.
Thanks for taking a look
--
Peter Geoghegan
On Thu, Jun 17, 2021 at 7:26 PM Peter Geoghegan <pg@bowt.ie> wrote:
Thanks for the review!
Attached is v3, which has all the changes that you suggested (plus the
doc stuff from Justin).
Just pushed a version of that with much improved documentation.
Thanks again
--
Peter Geoghegan
On Jun 14, 2021, at 7:46 PM, Peter Geoghegan <pg@bowt.ie> wrote:
Does anyone else have an opinion on this? Of course I can easily add a
GUC. But I won't do so in the absence of any real argument in favor of
it.
I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Jun 20, 2021 at 9:22 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.
One clear argument in favor of the VACUUM option (not so much the
reloption) is that it enables certain testing scenarios.
For example, I was recently using pg_visibility to do a low-level
analysis of how visibility map bits were getting set with a test case
that built on the BenchmarkSQL fair-use TPC-C implementation. The
optimization was something that I noticed in certain scenarios -- I
could have used the option of disabling it at the VACUUM command level
just to get a perfectly clean slate. A small fraction of the pages in
the table to not be set all-visible, which would be inconsequential to
users but was annoying in the context of this particular test
scenario.
The way the optimization works will only ever leave an affected table
in a state where the LP_DEAD items left behind would be highly
unlikely to be counted by ANALYZE. They would not be counted
accurately anyway, either because they're extremely few in number or
because there are relatively many that are concentrated in just a few
heap blocks -- that's how block-based sampling by ANALYZE works.
In short, even if there really was a performance problem implied by
the bypass indexes optimization, it seems unlikely that autovacuum
would run in the first place to take care of it, with or without the
optimization. Even if autovacuum_vacuum_scale_factor were set very
aggressively. VACUUM (really autovacuum) just doesn't tend to work at
that level of precision.
--
Peter Geoghegan