Eagerly scan all-visible pages to amortize aggressive vacuum
Hi,
An aggressive vacuum of a relation is triggered when its relfrozenxid
is older than vacuum_freeze_table_age XIDs. Aggressive vacuums require
examining every unfrozen tuple in the relation. Normal vacuums can
skip all-visible pages. So a relation with a large number of
all-visible but not all-frozen pages may suddenly have to vacuum an
order of magnitude more pages than the previous vacuum.
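Roughly, the trigger condition looks like the following sketch (a
simplified version of what vacuum_get_cutoffs() does; the real code
also clamps the age against autovacuum_freeze_max_age and performs an
analogous check on relminmxid):

    TransactionId nextXID = ReadNextTransactionId();
    TransactionId aggressiveXIDCutoff = nextXID - vacuum_freeze_table_age;

    if (!TransactionIdIsNormal(aggressiveXIDCutoff))
        aggressiveXIDCutoff = FirstNormalTransactionId;

    /* relfrozenxid at or before the cutoff => aggressive vacuum */
    if (TransactionIdPrecedesOrEquals(rel->rd_rel->relfrozenxid,
                                      aggressiveXIDCutoff))
        return true;    /* must scan every unfrozen tuple */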
In many cases, these all-visible but not all-frozen pages are not part
of the working set and must be read in. All of the pages with newly
frozen tuples have to be written out, and all of the WAL associated
with freezing and with setting the pages all-frozen in the VM must be
emitted. This extra I/O can substantially affect performance of the
foreground workload.
The best solution would be to freeze the pages at the time we set them
all-visible. But we don't want to do that for pages that will soon be
modified again, because freezing costs extra I/O.
Last year, I worked on a vacuum patch to try to predict which pages
should be eagerly frozen [1] by building a distribution of page
modification intervals and estimating the probability that a given
page would stay frozen long enough to merit freezing.
While working on it I encountered a problem. Pages set all-visible but
not all-frozen by vacuum and not modified again do not have a
modification interval. As such, the distribution would not account for
an outstanding debt of old, unfrozen pages.
As I thought about the problem more, I realized that even if we could
find a way to include those pages in our model and then predict and
effectively eagerly freeze pages, there would always be pages we miss
that have to be picked up later by an aggressive vacuum.
While it would be best to freeze these pages the first time they are
vacuumed and set all-visible, that write amplification has to be paid
eventually either way. This patch proposes to spread it out across
multiple "semi-aggressive" vacuums.
I believe eager scanning is actually a required step toward more
intelligent eager freezing. It is worth noting that eager scanning
should also allow us to lift the restriction on setting pages
all-visible in the VM during on-access pruning. This could enable
index-only scans in more cases.
The approach I take in the attached patch set is built on suggestions
and feedback from both Robert and Andres as well as my own ideas and
research.
It implements a new "semi-aggressive" vacuum. Semi-aggressive vacuums
eagerly scan some number of all-visible but not all-frozen pages in
hopes of freezing them. All-visible pages that are eagerly scanned and
set all-frozen in the visibility map are considered successful eager
scans and those not frozen are considered failed eager scans.
Because we want to amortize our eager scanning across a few vacuums,
we cap the maximum number of successful eager scans at a percentage of
the total number of all-visible but not all-frozen pages in the table
(currently 20%).
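In 0009, that cap is computed from the visibility map counts taken at
the start of the vacuum, roughly:

    visibilitymap_count(rel, &orig_rel_allvisible, &orig_rel_allfrozen);
    vacrel->remaining_eager_scan_successes =
        (BlockNumber) (EAGER_SCAN_SUCCESS_RATE *
                       (orig_rel_allvisible - orig_rel_allfrozen));

where EAGER_SCAN_SUCCESS_RATE is 0.2 in v1.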
We also want to cap the maximum number of failures. We assume that
different areas or "regions" of the relation are likely to contain
similarly aged data. So, if too many blocks are eagerly scanned and
not frozen in a given region of the table, eager scanning is
temporarily suspended.
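The suspension logic in 0009 looks roughly like this: a region is
RELSEG_SIZE blocks (or a quarter of a small table), and the failure
counter resets once we give up on a region:

    region_size = Min(RELSEG_SIZE, vacrel->rel_pages / 4);

    if (vacrel->eager_scanned_failed_frozen >= MAX_SUCCESSIVE_EAGER_SCAN_FAILS)
    {
        /* resume eager scanning at the start of the next region */
        offset = vacrel->eager_scanned_failed_frozen % region_size;
        vacrel->unaggressive_to = vacrel->current_block + (region_size - offset);
        vacrel->eager_scanned_failed_frozen = 0;
    }

with MAX_SUCCESSIVE_EAGER_SCAN_FAILS set to 1024 in v1.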
Since I refer to vacuums that eagerly scan a set number of pages as
"semi-aggressive vacuums," I’ve begun calling those that scan every
unfrozen tuple "fully aggressive vacuums" and those with no eager
scanning, or with permanently disabled eager scanning, "unaggressive
vacuums."
v1 of this feature is attached. The first eight patches in the set are
preliminary.
I've proposed 0001-0003 in this thread [2] -- they boil down to
counting pages set all-frozen in the VM.
0004-0007 are a bit of refactoring to put the code in a better shape
for the eager scanning feature.
0008 is a WIP patch to add a more general description of heap
vacuuming to the top of vacuumlazy.c.
0009 is the actual eager scanning feature.
To demonstrate the results, I ran an append-only workload for over 3
hours on master and with my patch applied. The patched version of
Postgres amortized the work of freezing the all-visible but not
all-frozen pages nicely: the first aggressive vacuum took 44 seconds
with the patch and 1201 seconds on master.
patch:

LOG: automatic aggressive vacuum of table "history": index scans: 0
    vacuum duration: 44 seconds (msecs: 44661).
    pages: 0 removed, 27425085 remain, 1104095 scanned (4.03% of total), 709889 eagerly scanned
    frozen: 316544 pages from table (1.15% of total) had 17409920 tuples frozen. 316544 pages set all-frozen in the VM
    I/O timings: read: 1160.599 ms, write: 2461.205 ms. approx time spent in vacuum delay: 16230 ms.
    buffer usage: 1105630 hits, 1111898 reads, 646229 newly dirtied, 1750426 dirtied.
    WAL usage: 1027099 records, 316566 full page images, 276209780 bytes.
master:

LOG: automatic aggressive vacuum of table "history": index scans: 0
    vacuum duration: 1201 seconds (msecs: 1201487).
    pages: 0 removed, 27515348 remain, 15800948 scanned (57.43% of total), 15098257 eagerly scanned
    frozen: 15096384 pages from table (54.87% of total) had 830247896 tuples frozen. 15096384 pages set all-frozen in the VM
    I/O timings: read: 246537.348 ms, write: 73000.498 ms. approx time spent in vacuum delay: 349166 ms.
    buffer usage: 15798343 hits, 15813524 reads, 15097063 newly dirtied, 31274333 dirtied.
    WAL usage: 30733564 records, 15097073 full page images, 11789257631 bytes.
This is because, with the patch, the freezing work is being off-loaded
to earlier vacuums.
In the attached chart.png, you can see the vm_page_freezes climbing
steadily with the patch, whereas on master, there are sudden spikes
aligned with the aggressive vacuums. You can also see that the number
of pages that are all-visible but not all-frozen grows steadily on
master until the aggressive vacuum. This is vacuum's "backlog" of
freezing work.
In this benchmark, the TPS is rate-limited, but using pgbench's
per-statement reports, I calculated that the P99 latency is 16 ms on
master and 1 ms with the patch. Vacuuming pages sooner decreases
vacuum reads, and spreading the freezing work over more vacuums
improves P99 transaction latency.
Below are the comparative WAL volume, checkpointer and background
writer writes, reads and writes done by all other backend types, time
spent vacuuming (in milliseconds), and P99 latency (in milliseconds).
Notice that overall vacuum I/O time is substantially lower with the
patch.
version   wal     cptr_bgwriter_w   other_rw    vac_io_time   p99_lat
patch     770 GB  5903264           235073744   513722        1
master    767 GB  5908523           216887764   1003654       16
(Note that the benchmarks were run on Postgres with a few extra
patches applied to both master and the patch to trigger vacuums more
frequently. I've proposed those here [3].)
I've also run the built-in tpcb-like pgbench workload and confirmed
that it improves the vacuuming behavior on pgbench_history but has
little impact on vacuuming of heavy-update tables like
pgbench_accounts -- depending on how aggressively the eager scanning
is tuned. Which brings me to the TODOs.
I need to do further benchmarking and investigation to determine
optimal failure and success caps -- ones that will work well for all
workloads. Perhaps the failure cap per region should be configurable.
I also need to try other scenarios -- like those in which old data is
deleted -- and determine if the region boundaries should change from
run to run to avoid eager scanning and failing to freeze the same
pages each time.
Also, all my benchmarking so far has been done on compressed
timelines: I tuned Postgres so that behavior that would normally play
out over days or a week occurs within a few hours. As such, I need to
start running longer benchmarks to observe the behavior in a more
natural environment.
- Melanie
[1]: /messages/by-id/CAAKRu_b3tpbdRPUPh1Q5h35gXhY=spH2ssNsEsJ9sDfw6=PEAg@mail.gmail.com
[2]: /messages/by-id/CAAKRu_aJM+0Gwoq_+-sozMX8QEax4QeDhMvySxRt2ayteXJNCg@mail.gmail.com
[3]: /messages/by-id/CAAKRu_aj-P7YyBz_cPNwztz6ohP+vWis=iz3YcomkB3NpYA--w@mail.gmail.com
Attachments:
v1-0007-Make-heap_vac_scan_next_block-return-BlockNumber.patch
From 78ad9e022b95e024ff5bfa96af78e9e44730c970 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 28 Oct 2024 11:42:10 -0400
Subject: [PATCH v1 7/9] Make heap_vac_scan_next_block() return BlockNumber
After removing the requirement for blkno to be set to rel_pages outside
of lazy_scan_heap(), heap_vac_scan_next_block() can return the next
block number for vacuum to scan. This makes the interface more
straightforward as well as paving the way for heap_vac_scan_next_block()
to be used by the read stream API as a callback to implement streaming
vacuum.
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 52c9d49f2b1..7ce69953ba0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -229,8 +229,8 @@ typedef struct LVSavedErrInfo
/* non-export function prototypes */
static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
- bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(LVRelState *vacrel,
+ bool *all_visible_according_to_vm);
static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
@@ -857,7 +857,8 @@ lazy_scan_heap(LVRelState *vacrel)
vacrel->next_unskippable_allvis = false;
vacrel->next_unskippable_vmbuffer = InvalidBuffer;
- while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+ while (BlockNumberIsValid(blkno = heap_vac_scan_next_block(vacrel,
+ &all_visible_according_to_vm)))
{
Buffer buf;
Page page;
@@ -1096,11 +1097,11 @@ lazy_scan_heap(LVRelState *vacrel)
* lazy_scan_heap() calls here every time it needs to get the next block to
* prune and vacuum. The function uses the visibility map, vacuum options,
* and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * returns the next block to process.
*
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm. The return value is false if
- * there are no further blocks to process.
+ * The block number and visibility status of the next block to process are
+ * returned and set in *all_visible_according_to_vm. The return value is
+ * InvalidBlockNumber if there are no further blocks to process.
*
* vacrel is an in/out parameter here. Vacuum options and information about
* the relation are read. vacrel->skippedallvis is set if we skip a block
@@ -1108,8 +1109,8 @@ lazy_scan_heap(LVRelState *vacrel)
* relfrozenxid in that case. vacrel also holds information about the next
* unskippable block, as bookkeeping for this function.
*/
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+static BlockNumber
+heap_vac_scan_next_block(LVRelState *vacrel,
bool *all_visible_according_to_vm)
{
/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
@@ -1117,7 +1118,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
/* Have we reached the end of the relation? */
if (vacrel->current_block >= vacrel->rel_pages)
- return false;
+ return InvalidBlockNumber;
/*
* We must be in one of the three following states:
@@ -1166,9 +1167,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
* but chose not to. We know that they are all-visible in the VM,
* otherwise they would've been unskippable.
*/
- *blkno = vacrel->current_block;
*all_visible_according_to_vm = true;
- return true;
+ return vacrel->current_block;
}
else
{
@@ -1178,9 +1178,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
*/
Assert(vacrel->current_block == vacrel->next_unskippable_block);
- *blkno = vacrel->current_block;
*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
- return true;
+ return vacrel->current_block;
}
}
--
2.34.1
v1-0008-WIP-Add-more-general-summary-to-vacuumlazy.c.patch
From 818d1c3b068c6705611256cfc3eb1f10bdc0b684 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 1 Nov 2024 18:25:05 -0400
Subject: [PATCH v1 8/9] WIP: Add more general summary to vacuumlazy.c
Currently the summary at the top of vacuumlazy.c provides some specific
details related to the new dead TID storage in 17. I plan to add a
summary and maybe some sub-sections to contextualize it.
---
src/backend/access/heap/vacuumlazy.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7ce69953ba0..15a04c6b10b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3,6 +3,17 @@
* vacuumlazy.c
* Concurrent ("lazy") vacuuming.
*
+ * Heap relations are vacuumed in three main phases. In the first phase,
+ * vacuum scans relation pages, pruning and freezing tuples and saving dead
+ * tuples' TIDs in a TID store. If that TID store fills up or vacuum finishes
+ * scanning the relation, it progresses to the second phase: index vacuuming.
+ * After index vacuuming is complete, vacuum scans the blocks of the relation
+ * indicated by the TIDs in the TID store and reaps the dead tuples, freeing
+ * that space for future tuples. Finally, vacuum may truncate the relation if
+ * it has emptied pages at the end. XXX: this summary needs work.
+ *
+ * Dead TID Storage:
+ *
* The major space usage for vacuuming is storage for the dead tuple IDs that
* are to be removed from indexes. We want to ensure we can vacuum even the
* very largest relations with finite memory space usage. To do that, we set
--
2.34.1
v1-0009-Eagerly-scan-all-visible-pages-to-amortize-aggres.patch
From f21f0bab1dbe675be4b4dddcb2eea486d8a69d36 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 28 Oct 2024 12:15:08 -0400
Subject: [PATCH v1 9/9] Eagerly scan all-visible pages to amortize aggressive
vacuum
Introduce semi-aggressive vacuums, which scan some of the all-visible
but not all-frozen pages in the relation to amortize the cost of an
aggressive vacuum.
Because the goal is to freeze these all-visible pages, all-visible pages
that are eagerly scanned and set all-frozen in the visibility map are
considered successful eager scans and those not frozen are considered
failed eager scans.
If too many eager scans fail in a row, eager scanning is temporarily
suspended until a later portion of the relation. Because the goal is to
amortize aggressive vacuums, we cap the number of successes as well.
Once we reach the maximum number of blocks successfully eager scanned
and frozen, the semi-aggressive vacuum is downgraded to an unaggressive
vacuum.
---
src/backend/access/heap/vacuumlazy.c | 327 +++++++++++++++++++++++----
src/backend/commands/vacuum.c | 20 +-
src/include/commands/vacuum.h | 27 ++-
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 326 insertions(+), 49 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 15a04c6b10b..adabb5ff5f1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -12,6 +12,40 @@
* that space for future tuples. Finally, vacuum may truncate the relation if
* it has emptied pages at the end. XXX: this summary needs work.
*
+ * Relation Scanning:
+ *
+ * Vacuum scans the heap relation, starting at the beginning and progressing
+ * to the end, skipping pages as permitted by their visibility status, vacuum
+ * options, and the aggressiveness level of the vacuum.
+ *
+ * When page skipping is enabled, unaggressive vacuums may skip scanning pages
+ * that are marked all-visible in the visibility map. We may choose not to
+ * skip pages if the range of skippable pages is below SKIP_PAGES_THRESHOLD.
+ *
+ * Semi-aggressive vacuums will scan skippable pages in an effort to freeze
+ * them and decrease the backlog of all-visible but not all-frozen pages that
+ * have to be processed to advance relfrozenxid and avoid transaction ID
+ * wraparound.
+ *
+ * We count it as a success when we are able to set an eagerly scanned page
+ * all-frozen in the VM and a failure when we are not able to set the page
+ * all-frozen.
+ *
+ * Because we want to amortize the overhead of freezing pages over multiple
+ * vacuums, we cap the number of successful eager scans to
+ * EAGER_SCAN_SUCCESS_RATE of the number of all-visible but not all-frozen
+ * pages at the beginning of the vacuum.
+ *
+ * On the assumption that different regions of the table are likely to contain
+ * similarly aged data, we use a localized failure cap instead of a global cap
+ * for the whole relation. The failure count is reset on each region of the
+ * table, comprised of RELSEG_SIZE blocks (or 1/4 of the table size for a
+ * small table). In each region, we tolerate MAX_SUCCESSIVE_EAGER_SCAN_FAILS
+ * before suspending eager scanning until the end of the region.
+ *
+ * Fully aggressive vacuums must examine every unfrozen tuple and are thus not
+ * subject to failure or success caps when eagerly scanning all-visible pages.
+ *
* Dead TID Storage:
*
* The major space usage for vacuuming is storage for the dead tuple IDs that
@@ -142,6 +176,27 @@ typedef enum
VACUUM_ERRCB_PHASE_TRUNCATE,
} VacErrPhase;
+/*
+ * Semi-aggressive vacuums eagerly scan some all-visible but not all-frozen
+ * pages. Since our goal is to freeze these pages, an eager scan that fails to
+ * set the page all-frozen in the VM is considered to have "failed".
+ *
+ * On the assumption that different regions of the table tend to have
+ * similarly aged data, once we fail to freeze MAX_SUCCESSIVE_EAGER_SCAN_FAILS
+ * blocks, we suspend eager scanning until vacuum has progressed to another
+ * region of the table with potentially older data.
+ */
+#define MAX_SUCCESSIVE_EAGER_SCAN_FAILS 1024
+
+/*
+ * An eager scan of a page that is set all-frozen in the VM is considered
+ * "successful". To spread out eager scanning across multiple semi-aggressive
+ * vacuums, we limit the number of successful eager scans (as well as the
+ * number of failures). The maximum number of successful eager scans is
+ * calculated as a ratio of the all-visible but not all-frozen pages at the
+ * beginning of the vacuum.
+ */
+#define EAGER_SCAN_SUCCESS_RATE 0.2
typedef struct LVRelState
{
/* Target heap relation and its indexes */
@@ -153,8 +208,22 @@ typedef struct LVRelState
BufferAccessStrategy bstrategy;
ParallelVacuumState *pvs;
- /* Aggressive VACUUM? (must set relfrozenxid >= FreezeLimit) */
- bool aggressive;
+ /*
+ * Whether or not this is an aggressive, semi-aggressive, or unaggressive
+ * VACUUM. A fully aggressive vacuum must set relfrozenxid >= FreezeLimit
+ * and therefore must scan every unfrozen tuple. A semi-aggressive vacuum
+ * will scan a certain number of all-visible pages until it is downgraded
+ * to an unaggressive vacuum.
+ */
+ VacAggressive aggressive;
+
+ /*
+ * A semi-aggressive vacuum that has failed to freeze too many eagerly
+ * scanned blocks in a row suspends eager scanning. unaggressive_to is the
+ * block number of the first block eligible for resumed eager scanning.
+ */
+ BlockNumber unaggressive_to;
+
/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
bool skipwithvm;
/* Consider index vacuuming bypass optimization? */
@@ -227,6 +296,26 @@ typedef struct LVRelState
BlockNumber next_unskippable_block; /* next unskippable block */
bool next_unskippable_allvis; /* its visibility status */
Buffer next_unskippable_vmbuffer; /* buffer containing its VM bit */
+
+ /*
+ * Count of skippable blocks eagerly scanned as part of a semi-aggressive
+ * vacuum (for logging only).
+ */
+ BlockNumber eager_scanned;
+
+ /*
+ * The number of eagerly scanned blocks a semi-aggressive vacuum failed to
+ * freeze (due to age) in the current eager scan region. It is reset each
+ * time we hit MAX_SUCCESSIVE_EAGER_SCAN_FAILS.
+ */
+ BlockNumber eager_scanned_failed_frozen;
+
+ /*
+ * The remaining number of blocks a semi-aggressive vacuum will consider
+ * eager scanning. This is initialized to EAGER_SCAN_SUCCESS_RATE of the
+ * total number of all-visible but not all-frozen pages.
+ */
+ BlockNumber remaining_eager_scan_successes;
} LVRelState;
/* Struct for saving and restoring vacuum error information. */
@@ -241,8 +330,13 @@ typedef struct LVSavedErrInfo
/* non-export function prototypes */
static void lazy_scan_heap(LVRelState *vacrel);
static BlockNumber heap_vac_scan_next_block(LVRelState *vacrel,
- bool *all_visible_according_to_vm);
-static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
+ bool *all_visible_according_to_vm,
+ bool *was_eager_scanned);
+static void find_next_unskippable_block(
+ LVRelState *vacrel,
+ bool consider_eager_scan,
+ bool *was_eager_scanned,
+ bool *skipsallvis);
static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
BlockNumber blkno, Page page,
bool sharelock, Buffer vmbuffer);
@@ -314,7 +408,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
minmulti_updated;
BlockNumber orig_rel_pages,
new_rel_pages,
- new_rel_allvisible;
+ orig_rel_allvisible,
+ new_rel_allvisible,
+ orig_rel_allfrozen;
PGRUsage ru0;
TimestampTz starttime = 0;
PgStat_Counter startreadtime = 0,
@@ -458,6 +554,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
* to increase the number of dead tuples it can prune away.)
*/
vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs);
+ vacrel->unaggressive_to = 0;
+
vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
vacrel->vistest = GlobalVisTestFor(rel);
/* Initialize state used to track oldest extant XID/MXID */
@@ -471,24 +569,49 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
* Force aggressive mode, and disable skipping blocks using the
* visibility map (even those set all-frozen)
*/
- vacrel->aggressive = true;
+ vacrel->aggressive = VAC_AGGRESSIVE;
skipwithvm = false;
}
vacrel->skipwithvm = skipwithvm;
+ vacrel->eager_scanned = 0;
+ vacrel->eager_scanned_failed_frozen = 0;
+
+ /*
+ * Even if we successfully freeze them, we want to cap the number of
+ * eagerly scanned blocks so that we spread out the overhead across
+ * multiple vacuums. remaining_eager_scan_successes is only used by
+ * semi-aggressive vacuums.
+ */
+ visibilitymap_count(rel, &orig_rel_allvisible, &orig_rel_allfrozen);
+ vacrel->remaining_eager_scan_successes =
+ (BlockNumber) (EAGER_SCAN_SUCCESS_RATE * (orig_rel_allvisible - orig_rel_allfrozen));
if (verbose)
{
- if (vacrel->aggressive)
- ereport(INFO,
- (errmsg("aggressively vacuuming \"%s.%s.%s\"",
- vacrel->dbname, vacrel->relnamespace,
- vacrel->relname)));
- else
- ereport(INFO,
- (errmsg("vacuuming \"%s.%s.%s\"",
- vacrel->dbname, vacrel->relnamespace,
- vacrel->relname)));
+ switch (vacrel->aggressive)
+ {
+ case VAC_UNAGGRESSIVE:
+ ereport(INFO,
+ (errmsg("vacuuming \"%s.%s.%s\"",
+ vacrel->dbname, vacrel->relnamespace,
+ vacrel->relname)));
+ break;
+
+ case VAC_AGGRESSIVE:
+ ereport(INFO,
+ (errmsg("aggressively vacuuming \"%s.%s.%s\"",
+ vacrel->dbname, vacrel->relnamespace,
+ vacrel->relname)));
+ break;
+
+ case VAC_SEMIAGGRESSIVE:
+ ereport(INFO,
+ (errmsg("semiaggressively vacuuming \"%s.%s.%s\"",
+ vacrel->dbname, vacrel->relnamespace,
+ vacrel->relname)));
+ break;
+ }
}
/*
@@ -545,11 +668,13 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
* Non-aggressive VACUUMs may advance them by any amount, or not at all.
*/
Assert(vacrel->NewRelfrozenXid == vacrel->cutoffs.OldestXmin ||
- TransactionIdPrecedesOrEquals(vacrel->aggressive ? vacrel->cutoffs.FreezeLimit :
+ TransactionIdPrecedesOrEquals(vacrel->aggressive == VAC_AGGRESSIVE ?
+ vacrel->cutoffs.FreezeLimit :
vacrel->cutoffs.relfrozenxid,
vacrel->NewRelfrozenXid));
Assert(vacrel->NewRelminMxid == vacrel->cutoffs.OldestMxact ||
- MultiXactIdPrecedesOrEquals(vacrel->aggressive ? vacrel->cutoffs.MultiXactCutoff :
+ MultiXactIdPrecedesOrEquals(vacrel->aggressive == VAC_AGGRESSIVE ?
+ vacrel->cutoffs.MultiXactCutoff :
vacrel->cutoffs.relminmxid,
vacrel->NewRelminMxid));
if (vacrel->skippedallvis)
@@ -559,7 +684,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
* chose to skip an all-visible page range. The state that tracks new
* values will have missed unfrozen XIDs from the pages we skipped.
*/
- Assert(!vacrel->aggressive);
+ Assert(vacrel->aggressive != VAC_AGGRESSIVE);
vacrel->NewRelfrozenXid = InvalidTransactionId;
vacrel->NewRelminMxid = InvalidMultiXactId;
}
@@ -654,14 +779,14 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
* implies aggressive. Produce distinct output for the corner
* case all the same, just in case.
*/
- if (vacrel->aggressive)
+ if (vacrel->aggressive == VAC_AGGRESSIVE)
msgfmt = _("automatic aggressive vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");
else
msgfmt = _("automatic vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");
}
else
{
- if (vacrel->aggressive)
+ if (vacrel->aggressive == VAC_AGGRESSIVE)
msgfmt = _("automatic aggressive vacuum of table \"%s.%s.%s\": index scans: %d\n");
else
msgfmt = _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n");
@@ -802,6 +927,16 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
}
}
+/*
+ * Helper to decrement a block number to 0 without wrapping around.
+ */
+static void
+decrement_blkno(BlockNumber *block)
+{
+ if ((*block) > 0)
+ (*block)--;
+}
+
/*
* lazy_scan_heap() -- workhorse function for VACUUM
*
@@ -844,7 +979,8 @@ lazy_scan_heap(LVRelState *vacrel)
BlockNumber rel_pages = vacrel->rel_pages,
blkno,
next_fsm_block_to_vacuum = 0;
- bool all_visible_according_to_vm;
+ bool all_visible_according_to_vm,
+ was_eager_scanned = false;
TidStore *dead_items = vacrel->dead_items;
VacDeadItemsInfo *dead_items_info = vacrel->dead_items_info;
@@ -855,6 +991,7 @@ lazy_scan_heap(LVRelState *vacrel)
PROGRESS_VACUUM_MAX_DEAD_TUPLE_BYTES
};
int64 initprog_val[3];
+ BlockNumber page_freezes = 0;
/* Report that we're scanning the heap, advertising total # of blocks */
initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
@@ -869,7 +1006,8 @@ lazy_scan_heap(LVRelState *vacrel)
vacrel->next_unskippable_vmbuffer = InvalidBuffer;
while (BlockNumberIsValid(blkno = heap_vac_scan_next_block(vacrel,
- &all_visible_according_to_vm)))
+ &all_visible_according_to_vm,
+ &was_eager_scanned)))
{
Buffer buf;
Page page;
@@ -956,11 +1094,23 @@ lazy_scan_heap(LVRelState *vacrel)
if (!got_cleanup_lock)
LockBuffer(buf, BUFFER_LOCK_SHARE);
+ page_freezes = vacrel->vm_page_freezes;
+
/* Check for new or empty pages before lazy_scan_[no]prune call */
if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
vmbuffer))
{
/* Processed as new/empty page (lock and pin released) */
+
+ /* count an eagerly scanned page as a failure or a success */
+ if (was_eager_scanned)
+ {
+ if (vacrel->vm_page_freezes > page_freezes)
+ decrement_blkno(&vacrel->remaining_eager_scan_successes);
+ else
+ vacrel->eager_scanned_failed_frozen++;
+ }
+
continue;
}
@@ -979,7 +1129,7 @@ lazy_scan_heap(LVRelState *vacrel)
* lazy_scan_noprune could not do all required processing. Wait
* for a cleanup lock, and call lazy_scan_prune in the usual way.
*/
- Assert(vacrel->aggressive);
+ Assert(vacrel->aggressive == VAC_AGGRESSIVE);
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
LockBufferForCleanup(buf);
got_cleanup_lock = true;
@@ -1003,6 +1153,15 @@ lazy_scan_heap(LVRelState *vacrel)
vmbuffer, all_visible_according_to_vm,
&has_lpdead_items);
+ /* count an eagerly scanned page as a failure or a success */
+ if (was_eager_scanned)
+ {
+ if (vacrel->vm_page_freezes > page_freezes)
+ decrement_blkno(&vacrel->remaining_eager_scan_successes);
+ else
+ vacrel->eager_scanned_failed_frozen++;
+ }
+
/*
* Now drop the buffer lock and, potentially, update the FSM.
*
@@ -1112,7 +1271,9 @@ lazy_scan_heap(LVRelState *vacrel)
*
* The block number and visibility status of the next block to process are
* returned and set in *all_visible_according_to_vm. The return value is
- * InvalidBlockNumber if there are no further blocks to process.
+ * InvalidBlockNumber if there are no further blocks to process. If the block
+ * is being eagerly scanned, was_eager_scanned is set so that the caller can
+ * count whether or not we successfully freeze it.
*
* vacrel is an in/out parameter here. Vacuum options and information about
* the relation are read. vacrel->skippedallvis is set if we skip a block
@@ -1122,11 +1283,14 @@ lazy_scan_heap(LVRelState *vacrel)
*/
static BlockNumber
heap_vac_scan_next_block(LVRelState *vacrel,
- bool *all_visible_according_to_vm)
+ bool *all_visible_according_to_vm,
+ bool *was_eager_scanned)
{
/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
vacrel->current_block++;
+ *was_eager_scanned = false;
+
/* Have we reached the end of the relation? */
if (vacrel->current_block >= vacrel->rel_pages)
return InvalidBlockNumber;
@@ -1137,6 +1301,8 @@ heap_vac_scan_next_block(LVRelState *vacrel,
if (vacrel->current_block > vacrel->next_unskippable_block ||
vacrel->next_unskippable_block == InvalidBlockNumber)
{
+ bool consider_eager_scan = false;
+
/*
* 1. We have just processed an unskippable block (or we're at the
* beginning of the scan). Find the next unskippable block using the
@@ -1144,7 +1310,65 @@ heap_vac_scan_next_block(LVRelState *vacrel,
*/
bool skipsallvis;
- find_next_unskippable_block(vacrel, &skipsallvis);
+ /*
+ * Figure out if we should disable eager scan going forward or
+ * downgrade to an unaggressive vacuum altogether.
+ */
+ if (vacrel->aggressive == VAC_SEMIAGGRESSIVE)
+ {
+ /*
+ * If we hit our success limit, there is no need to eagerly scan
+ * any additional pages. Downgrade the vacuum to unaggressive.
+ */
+ if (vacrel->remaining_eager_scan_successes == 0)
+ vacrel->aggressive = VAC_UNAGGRESSIVE;
+
+ /*
+ * If we hit the max number of failed eager scans for this region
+ * of the table, figure out where the next eager scan region
+ * should start. Eager scanning is effectively disabled until we
+ * scan a block in that new region.
+ */
+ else if (vacrel->eager_scanned_failed_frozen >=
+ MAX_SUCCESSIVE_EAGER_SCAN_FAILS)
+ {
+ BlockNumber region_size,
+ offset;
+
+ /*
+ * On the assumption that different regions of the table are
+ * likely to have similarly aged data, we will retry eager
+ * scanning again later. For a small table, we'll retry eager
+ * scanning every quarter of the table. For a larger table,
+ * we'll consider eager scanning again after processing
+ * another region's worth of data.
+ *
+ * We consider the region to start from the first failure, so
+ * calculate the block to restart eager scanning from there.
+ */
+ region_size = Min(RELSEG_SIZE, (vacrel->rel_pages / 4));
+
+ offset = vacrel->eager_scanned_failed_frozen % region_size;
+
+ Assert(vacrel->eager_scanned > 0);
+
+ vacrel->unaggressive_to = vacrel->current_block + (region_size - offset);
+ vacrel->eager_scanned_failed_frozen = 0;
+ }
+ }
+
+ /*
+ * If it is a fully aggressive vacuum or we haven't yet hit the fail
+ * limit in our current eager scan region, consider eager scanning the
+ * next block.
+ */
+ if (vacrel->aggressive == VAC_AGGRESSIVE)
+ consider_eager_scan = true;
+ else if (vacrel->aggressive == VAC_SEMIAGGRESSIVE)
+ consider_eager_scan = vacrel->current_block >= vacrel->unaggressive_to;
+
+ find_next_unskippable_block(vacrel, consider_eager_scan,
+ was_eager_scanned, &skipsallvis);
/*
* We now know the next block that we must process. It can be the
@@ -1199,6 +1423,11 @@ heap_vac_scan_next_block(LVRelState *vacrel,
* The next unskippable block and its visibility information is updated in
* vacrel.
*
+ * consider_eager_scan indicates whether or not we should consider scanning
+ * all-visible but not all-frozen blocks. was_eager_scanned is set to true if
+ * we decided to eager scan a block. In this case, next_unskippable_block is
+ * set to that block number.
+ *
* Note: our opinion of which blocks can be skipped can go stale immediately.
* It's okay if caller "misses" a page whose all-visible or all-frozen marking
* was concurrently cleared, though. All that matters is that caller scan all
@@ -1208,7 +1437,11 @@ heap_vac_scan_next_block(LVRelState *vacrel,
* to skip such a range is actually made, making everything safe.)
*/
static void
-find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
+find_next_unskippable_block(
+ LVRelState *vacrel,
+ bool consider_eager_scan,
+ bool *was_eager_scanned,
+ bool *skipsallvis)
{
BlockNumber rel_pages = vacrel->rel_pages;
BlockNumber next_unskippable_block = vacrel->next_unskippable_block + 1;
@@ -1217,7 +1450,7 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
*skipsallvis = false;
- for (;;)
+ for (;; next_unskippable_block++)
{
uint8 mapbits = visibilitymap_get_status(vacrel->rel,
next_unskippable_block,
@@ -1253,23 +1486,31 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
break;
/*
- * Aggressive VACUUM caller can't skip pages just because they are
- * all-visible. They may still skip all-frozen pages, which can't
- * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
+ * In all other cases, we can skip all-frozen pages. Even fully
+ * aggressive vacuums may skip all-frozen pages since all-frozen pages
+ * cannot contain XIDs < OldestXmin (XIDs that aren't already frozen
+ * by now).
*/
- if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
- {
- if (vacrel->aggressive)
- break;
+ if (mapbits & VISIBILITYMAP_ALL_FROZEN)
+ continue;
- /*
- * All-visible block is safe to skip in non-aggressive case. But
- * remember that the final range contains such a block for later.
- */
- *skipsallvis = true;
+ /*
+ * Fully aggressive vacuums cannot skip all-visible pages that are not
+ * also all-frozen. Semi-aggressive vacuums only skip such pages if
+ * they have hit the failure limit for the current eager scan region.
+ */
+ if (consider_eager_scan)
+ {
+ *was_eager_scanned = true;
+ vacrel->eager_scanned++;
+ break;
}
- next_unskippable_block++;
+ /*
+ * All-visible block is safe to skip in a semi or unaggressive vacuum.
+ * But remember that the final range contains such a block for later.
+ */
+ *skipsallvis = true;
}
/* write the local variables back to vacrel */
@@ -1781,7 +2022,7 @@ lazy_scan_noprune(LVRelState *vacrel,
&NoFreezePageRelminMxid))
{
/* Tuple with XID < FreezeLimit (or MXID < MultiXactCutoff) */
- if (vacrel->aggressive)
+ if (vacrel->aggressive == VAC_AGGRESSIVE)
{
/*
* Aggressive VACUUMs must always be able to advance rel's
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 86f36b36954..236bd2dbb98 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1079,7 +1079,7 @@ get_all_vacuum_rels(MemoryContext vac_context, int options)
* FreezeLimit (at a minimum), and relminmxid up to MultiXactCutoff (at a
* minimum).
*/
-bool
+VacAggressive
vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
struct VacuumCutoffs *cutoffs)
{
@@ -1213,7 +1213,7 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
aggressiveXIDCutoff = FirstNormalTransactionId;
if (TransactionIdPrecedesOrEquals(cutoffs->relfrozenxid,
aggressiveXIDCutoff))
- return true;
+ return VAC_AGGRESSIVE;
/*
* Similar to the above, determine the table freeze age to use for
@@ -1234,10 +1234,22 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
aggressiveMXIDCutoff = FirstMultiXactId;
if (MultiXactIdPrecedesOrEquals(cutoffs->relminmxid,
aggressiveMXIDCutoff))
- return true;
+ return VAC_AGGRESSIVE;
+
+ /*
+ * If we are not required to do a fully aggressive vacuum, we may still
+ * eagerly scan pages as long as relfrozenxid precedes the freeze limit.
+ * We don't bother enabling eager scanning if no tuples will be eligible
+ * to be frozen.
+ */
+ if ((TransactionIdIsNormal(cutoffs->relfrozenxid) &&
+ TransactionIdPrecedesOrEquals(cutoffs->relfrozenxid, cutoffs->FreezeLimit)) ||
+ (MultiXactIdIsValid(cutoffs->relminmxid) &&
+ MultiXactIdPrecedesOrEquals(cutoffs->relminmxid, cutoffs->MultiXactCutoff)))
+ return VAC_SEMIAGGRESSIVE;
/* Non-aggressive VACUUM */
- return false;
+ return VAC_UNAGGRESSIVE;
}
/*
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 759f9a87d38..39809f3fc83 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -288,6 +288,29 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * The aggressiveness level of a vacuum determines how many all-visible but
+ * not all-frozen pages it eagerly scans.
+ *
+ * An unaggressive vacuum scans no all-visible pages unless page skipping is
+ * disabled.
+ *
+ * A fully aggressive vacuum eagerly scans all all-visible but not all-frozen
+ * pages.
+ *
+ * A semi-aggressive vacuum eagerly scans a number of pages up to a limit
+ * based on whether or not it is succeeding or failing. A semi-aggressive
+ * vacuum is downgraded to an unaggressive vacuum when it hits its success
+ * quota. An aggressive vacuum cannot be downgraded. No aggressiveness
+ * level is ever upgraded.
+ */
+typedef enum VacAggressive
+{
+ VAC_UNAGGRESSIVE,
+ VAC_AGGRESSIVE,
+ VAC_SEMIAGGRESSIVE,
+} VacAggressive;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -335,8 +358,8 @@ extern void vac_update_relstats(Relation relation,
bool *frozenxid_updated,
bool *minmulti_updated,
bool in_outer_xact);
-extern bool vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
- struct VacuumCutoffs *cutoffs);
+extern VacAggressive vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
+ struct VacuumCutoffs *cutoffs);
extern bool vacuum_xid_failsafe_check(const struct VacuumCutoffs *cutoffs);
extern void vac_update_datfrozenxid(void);
extern void vacuum_delay_point(void);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 171a7dd5d2b..abed3008f87 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3056,6 +3056,7 @@ UserAuth
UserContext
UserMapping
UserOpts
+VacAggressive
VacAttrStats
VacAttrStatsP
VacDeadItemsInfo
--
2.34.1
chart.png