From 2ecb6ea2b3bca4264083b732e91ad3ad693caea3 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 11 Dec 2024 14:13:34 -0500
Subject: [PATCH v3 6/7] Add more general summary to vacuumlazy.c

Add more details on how vacuuming heap relations works to vacuumlazy.c.
Previously the top of vacuumlazy.c only had details related to the dead
TID storage added in Postgres 17. This commit adds a more general
summary to help future developers understand the heap relation vacuuming
implementation at a high level.

It would be good to add another sentence or two on index vacuuming.

Reviewed-by: Bilal Yavuz
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 37 ++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 59637284d02..2c30660f5eb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3,6 +3,43 @@
  * vacuumlazy.c
  *	  Concurrent ("lazy") vacuuming.
  *
+ * Heap relations are vacuumed in three main phases. In phase I, vacuum scans
+ * relation pages, pruning and freezing tuples and saving dead tuples' TIDs in
+ * a TID store. If that TID store fills up or vacuum finishes scanning the
+ * relation, it progresses to phase II: index vacuuming. Index vacuuming
+ * deletes the dead index entries referenced in the TID store. In phase III,
+ * vacuum scans the blocks of the relation indicated by the TIDs in the TID
+ * store and reaps the dead tuples, freeing that space for future tuples.
+ *
+ * If there are no indexes or index scanning is disabled, phase II may be
+ * skipped. If phase I identified very few dead index entries, vacuum may skip
+ * phases II and III.
+ *
+ * Finally, vacuum may truncate the relation if it has emptied pages at the
+ * end. After finishing all phases of work, vacuum updates relation statistics
+ * in pg_class and the cumulative statistics subsystem.
+ *
+ * Relation Scanning:
+ *
+ * Vacuum scans the heap relation, starting at the beginning and progressing
+ * to the end, skipping pages as permitted by their visibility status, vacuum
+ * options, and the eagerness level of the vacuum.
+ *
+ * When page skipping is enabled, non-aggressive vacuums may skip scanning
+ * pages that are marked all-visible in the visibility map. We may choose not
+ * to skip pages if the range of skippable pages is below
+ * SKIP_PAGES_THRESHOLD.
+ *
+ * Once vacuum has decided to scan a given block, it must read in the block
+ * and obtain a cleanup lock to prune tuples on the page. A non-aggressive
+ * vacuum may choose to skip pruning and freezing if it cannot acquire a
+ * cleanup lock on the buffer right away.
+ *
+ * After pruning and freezing, pages that are newly all-visible and all-frozen
+ * are marked as such in the visibility map.
+ *
+ * Dead TID Storage:
+ *
  * The major space usage for vacuuming is storage for the dead tuple IDs that
  * are to be removed from indexes.  We want to ensure we can vacuum even the
  * very largest relations with finite memory space usage.  To do that, we set
-- 
2.34.1

